Face Recognition and People Counting on the Edge

The importance of “customer traffic and behavior” for retailers.

Consumer insights and behavior analysis can help retailers evaluate the performance of outlets and make decisions to increase sales.

The Challenge

Footfall and customer behavior are important data metrics for retailers across the globe. Footfall data can show the results of a marketing strategy, conversion rates, and the effect of advertising.

Intelligence such as prime traffic times and the counters with the most footfall can inform retailers' strategic decisions on labor resources and marketing spend. For retailers, traffic count drives sales, and every retail store is looking to maximize this number.

The intelligence generated through this data can help retailers increase their sales, improve operational efficiency and enhance the customer experience.

  • ARM + FPGA Architecture
  • Integrated DPU & Vitis™ AI stack
  • Wi-Fi / 4G Communication
  • 8 IP Cameras Connectivity
  • Extended Storage & Recording
  • Easy to use API


With growing data privacy concerns and a growing need to make decisions on the edge, there is a requirement for intelligent devices capable of making real-time decisions. To meet this requirement, iWave has developed Corazon-AI, an Edge AI platform.

Corazon-AI gives software developers and system integrators an ideal platform for developing AI applications on the edge. Built on the Zynq UltraScale+ MPSoC series, with connectivity for 8 IP cameras and the Xilinx Vitis™ AI stack integrated, Corazon-AI is a good fit for building intelligence on the edge. The Xilinx Vitis AI stack includes pre-optimized deep learning models from mainstream frameworks such as TensorFlow, Caffe, and Darknet, as well as computer vision libraries. The stack lets developers accelerate the development of AI applications even without in-depth knowledge of FPGAs and deep learning. It supports C++ and Python APIs, which give developers programming flexibility.

Corazon-AI includes a dedicated AI inference engine and Deep Learning Processing Unit (DPU) implemented on the programmable logic (PL) side of the device. The AI inference engine has a configurable computation engine optimized for convolutional neural networks (CNNs) such as SSD, YOLO, ResNet, and VGG.

Face Recognition and People Counting on Corazon-AI

The application on Corazon-AI was built on a parallel computation architecture optimized for accurate face recognition and people counting models. High-resolution video from sources such as USB, HD-SDI, IP cameras, and stored video files was provided as input to Corazon-AI. Pre-processing functions such as decoding and video scaling (as required by the CNN algorithm) are implemented on the FPGA to accelerate the application.

Figure: Xilinx DPU AI Engine

The face recognition application pipeline is divided into two phases:

Phase 1: Face Detection

  • Detect whether a new face has entered the surveillance area
  • Send the (x, y) coordinates of the N newly detected faces to the face feature extraction algorithm

Phase 2: Face Recognition

  • Perform AI inference to extract 512 unique points as the feature used for face recognition
  • Compare the new face feature with the features pre-stored in the local database
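
The comparison step in Phase 2 can be illustrated with a minimal Python sketch. The article does not say which similarity measure or database layout the application uses, so the cosine similarity, the dictionary used as the "database", the `match_face` helper, and the 0.5 threshold are all assumptions chosen for illustration. Toy 4-element vectors stand in for the 512-point features.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)

def match_face(feature, database, threshold=0.5):
    """Return (label, score) for the best match, or (None, score) if too weak.

    `database` maps a label to a stored feature vector. `threshold` is an
    illustrative value, not one taken from the Corazon-AI application.
    """
    best_label, best_score = None, -1.0
    for label, stored in database.items():
        score = cosine_similarity(feature, stored)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score

# Hypothetical pre-stored local database (toy vectors, not real features).
database = {
    "alice": [1.0, 0.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0, 0.0],
}

print(match_face([0.9, 0.1, 0.0, 0.0], database))  # close to "alice"
print(match_face([0.0, 0.0, 1.0, 0.0], database))  # matches nobody
```

A face whose best score falls below the threshold is treated as unknown. In a real deployment the threshold would need tuning against the chosen model and the acceptable false-match rate.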

The people counting application ran the people detection CNN inference on the FPGA and an object tracking algorithm on the ARM processor cores.

The people counting application is divided into two phases:

Phase 1: Object Detection

  • Detect whether a new object has entered the user-defined area
  • Send the (x, y) coordinates of the detected object to the people tracking algorithm

Phase 2: Object Tracking

The tracking algorithm accepts the (x, y) coordinates of where an object is in an image and:

  • Assign a unique tracking ID to the object
  • Track the object as it moves around the video frame, predicting its location in the next frame from attributes of the object such as object flow and Euclidean distance
  • Increment or decrement the people count based on the location of the object inside the user-defined area in the frame
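
The tracking and counting steps above can be sketched in Python as follows. This is a simplified nearest-centroid tracker that uses only Euclidean distance, not the tracker used on Corazon-AI. The `CentroidTracker` and `PeopleCounter` classes, the 50-pixel matching radius, and the rectangular user-defined area are assumptions made for illustration.

```python
import math

class CentroidTracker:
    """Nearest-centroid tracker: each detection joins the closest known track."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # hypothetical matching radius in pixels
        self.next_id = 0
        self.tracks = {}  # tracking ID -> last known (x, y)

    def update(self, detections):
        """Match detections to tracks; return {tracking ID: (x, y)} for this frame."""
        unmatched = dict(self.tracks)
        current = {}
        for point in detections:
            best_id, best_dist = None, self.max_distance
            for track_id, last in unmatched.items():
                dist = math.dist(point, last)  # Euclidean distance
                if dist <= best_dist:
                    best_id, best_dist = track_id, dist
            if best_id is None:
                best_id = self.next_id  # new object: assign a fresh tracking ID
                self.next_id += 1
            else:
                del unmatched[best_id]  # a track can take only one detection
            current[best_id] = point
        self.tracks = current  # tracks with no detection this frame are dropped
        return current

def inside(point, area):
    """True if (x, y) lies in the rectangle (x_min, y_min, x_max, y_max)."""
    x, y = point
    x_min, y_min, x_max, y_max = area
    return x_min <= x <= x_max and y_min <= y <= y_max

class PeopleCounter:
    """Increment on entering the user-defined area, decrement on leaving it."""

    def __init__(self, area):
        self.area = area
        self.tracker = CentroidTracker()
        self.was_inside = {}  # tracking ID -> was the object inside last frame?
        self.count = 0

    def process_frame(self, detections):
        current = self.tracker.update(detections)
        for track_id, point in current.items():
            now_inside = inside(point, self.area)
            before = self.was_inside.get(track_id, False)
            if now_inside and not before:
                self.count += 1
            elif before and not now_inside:
                self.count -= 1
            self.was_inside[track_id] = now_inside
        # Forget objects that vanished; this sketch leaves the count unchanged for them.
        self.was_inside = {t: s for t, s in self.was_inside.items() if t in current}
        return self.count

# One person walks into the area and back out again.
counter = PeopleCounter(area=(100, 100, 300, 300))
frames = [[(80, 80)], [(110, 110)], [(140, 140)], [(110, 110)], [(80, 80)]]
print([counter.process_frame(f) for f in frames])  # [0, 1, 1, 1, 0]
```

A production tracker would also predict motion between frames and tolerate missed detections. This sketch drops a track as soon as its object is not detected.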

Heat mapping analytics can be used to determine the best and worst-performing areas in the store, helping retailers improve their services and operational efficiency.
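
One simple way to picture a heat map is as a coarse grid over the camera frame, where each cell counts how often a tracked position falls inside it. The Python sketch below assumes that approach; the article does not describe how the heat map is actually computed, and the `build_heat_map` and `hottest_and_coldest` helpers, the frame size, and the grid size are hypothetical.

```python
def build_heat_map(positions, frame_size, grid_size):
    """Count how many (x, y) positions fall into each cell of a coarse grid."""
    width, height = frame_size
    cols, rows = grid_size
    heat = [[0] * cols for _ in range(rows)]
    for x, y in positions:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore positions outside the frame
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        heat[row][col] += 1
    return heat

def hottest_and_coldest(heat):
    """Return the (row, col) of the busiest and the quietest grid cell."""
    cells = [
        (count, (row, col))
        for row, line in enumerate(heat)
        for col, count in enumerate(line)
    ]
    return max(cells)[1], min(cells)[1]

# Hypothetical tracked positions in a 640x480 frame, binned into a 4x3 grid.
positions = [(10, 10), (20, 15), (30, 5), (500, 300), (620, 470)]
heat = build_heat_map(positions, frame_size=(640, 480), grid_size=(4, 3))
for line in heat:
    print(line)
print(hottest_and_coldest(heat))
```

Cells with high counts correspond to areas where people were seen most often. When several cells tie, the helper picks one by grid position.
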
The overall performance of the application was approximately 22 fps.


Copyright © 2022 iWave Systems Technologies Pvt. Ltd.