PySDK guide: Object counting in polygon zones using DeGirum PySDK

Introduction

Object detection on a video stream with object counting in a polygon zone is an application of
artificial intelligence (AI) that finds extensive use in various fields, including surveillance, traffic management, and industrial automation. This technology leverages advanced computer vision techniques to identify and track objects within a specified area of interest.

Object detection involves the use of AI models, such as convolutional neural networks (CNNs), to analyze video frames and identify objects present within them. In the context of video streams, these models analyze successive frames to track the movement and changes in object positions over time.

A polygon zone is a defined area of interest within the video frame where object counting needs to occur. It is usually specified by a set of vertices that form a closed shape. The application of a polygon zone allows for focused analysis, restricting object counting to a specific region within the video stream.

The implementation of object counting within a polygon zone involves the following steps:

  • Polygon definition: Define the polygon zone by specifying its vertices. This defines the region where object counting will take place.

  • Frame analysis: Process each frame of the video stream using the object detection model. Identify and classify objects present in the frame.

  • Polygon intersection: Check whether the detected objects intersect with the defined polygon zone. This step ensures that only objects within the specified region are considered for counting.

  • Counting logic: Establish counting logic based on the number of objects within the polygon zones.

  • Real-time updates: Provide real-time updates on the object count within the polygon zone. This information can be utilized for various applications, such as monitoring crowd density, traffic flow, or object movement in a manufacturing environment.
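The counting logic described in these steps can be sketched in plain Python using a ray-casting point-in-polygon test. The detection boxes below are illustrative, not real model output:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: toggle 'inside' each time a ray going right from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_in_zone(detections, polygon):
    """Count detections whose bounding-box center falls inside the polygon."""
    count = 0
    for x1, y1, x2, y2 in detections:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        if point_in_polygon(cx, cy, polygon):
            count += 1
    return count

zone = [[100, 100], [300, 100], [300, 300], [100, 300]]
boxes = [[150, 150, 250, 250], [400, 400, 500, 500]]  # one inside, one outside
print(count_in_zone(boxes, zone))  # → 1
```

DeGirum Tools implements this logic (and much more) for you, as shown later in this guide; the sketch above only illustrates the idea.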

Applications

  • Surveillance: Monitor specific areas within a surveillance camera’s field of view, counting and tracking objects for security purposes.

  • Traffic management: Analyze vehicle movement within designated zones to optimize traffic flow and detect congestion.

  • Industrial automation: Count and track objects on factory floors to enhance production efficiency and safety.

  • Retail analytics: Monitor customer movement and product interactions within specific areas of a retail store for marketing and inventory management.

Using DeGirum PySDK for object zone counting

Prerequisites

Assuming you have configured a Python environment on your Windows, Linux, or MacOS computer, you’ll need to install DeGirum PySDK and DeGirum Tools Python packages by running the following commands (follow this link for details):

pip install -U degirum
pip install -U degirum_tools

Alternatively, you can use Google Colab to run the zone counting example Jupyter notebook provided by DeGirum.

Step-by-step guide

Import necessary packages

You’ll need to import degirum and degirum_tools packages:

import degirum as dg, degirum_tools

Select object detection AI model

As a starter, we will use the YOLOv8n COCO model, which can detect 80 COCO classes. We will take this model from the DeGirum Hailo cloud model zoo.

Let's define the cloud zoo URL and the model name:

model_zoo_url = "degirum/hailo"
model_name = "yolov8n_coco--640x640_quant_hailort_hailo8_1"

Here, degirum/hailo is the path to the DeGirum public cloud model zoo containing models for Hailo AI accelerators.

yolov8n_coco--640x640_quant_hailort_hailo8_1 is the model name we will use for object detection. It is based on the YOLOv8n Ultralytics model trained to detect 80 COCO classes and compiled for the Hailo8 AI hardware accelerator.

Define video source

For simplicity, we will use a short highway traffic video from the DeGirum PySDK examples GitHub repo:

video_source = "https://github.com/DeGirum/PySDKExamples/raw/main/images/Traffic.mp4"

However, you can use any video file you want. If you run the code locally and your computer has a video camera, you may use the video camera as a video source:

video_source = 0 # specify index of local video camera

Define polygon zones

For each zone in which you want to count objects, you’ll need to define the list of [x,y] pixel coordinates of polygon vertices which surround that zone. Then, you’ll define a list containing all zone polygons:

polygon_zones = [
    [[265, 260], [730, 260], [870, 450], [120, 450]], # zone 1
    [[400, 100], [610, 100], [690, 200], [320, 200]], # zone 2
]

Here, we defined two zones with four vertices in each zone.
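Zone coordinates are in pixels, so they must fit within your video frame. A quick sanity check like the one below can catch out-of-frame vertices before you run inference. The frame dimensions here are hypothetical; adjust them to your actual video:

```python
# Assumed frame dimensions for the sanity check (hypothetical; adjust to your video)
frame_w, frame_h = 960, 540

polygon_zones = [
    [[265, 260], [730, 260], [870, 450], [120, 450]],  # zone 1
    [[400, 100], [610, 100], [690, 200], [320, 200]],  # zone 2
]

for i, zone in enumerate(polygon_zones):
    xs = [x for x, _ in zone]
    ys = [y for _, y in zone]
    ok = 0 <= min(xs) and max(xs) < frame_w and 0 <= min(ys) and max(ys) < frame_h
    print(f"zone {i + 1}: x range {min(xs)}-{max(xs)}, y range {min(ys)}-{max(ys)}, in frame: {ok}")
```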

Obtain cloud API access token

In order to use AI models from the DeGirum AI Hub, you’ll need to register and generate a cloud API access token. Please follow these instructions. Registration is free.

Connect to model zoo and load the model

# connect to AI inference engine
zoo = dg.connect(dg.CLOUD, model_zoo_url, "<cloud API token>")

# load model
model = zoo.load_model(model_name, output_class_set={"car", "motorbike", "truck"})

Here, we connect to the DeGirum AI Hub to run AI model inferences (by using dg.CLOUD parameter) and to the cloud model zoo specified by model_zoo_url using the cloud API access token obtained in the previous step. Then, we load a model specified by model_name and restrict the model output to the subset of classes.

For more inference options, please refer to this documentation page.

Define interactive display

If you would like to observe live real-time results of object detection and zone counting with AI annotations overlay, you may use the Display class from the DeGirum Tools package:

with degirum_tools.Display("AI Camera") as display:
    ...

Note: Interactive display is not updated in real-time in a Colab environment.

Define zone counting object

We will use the ZoneCounter object from the DeGirum Tools package, which greatly simplifies the task of object zone counting.

with degirum_tools.Display("AI Camera") as display:
    zone_counter = degirum_tools.ZoneCounter(
        polygon_zones,
        triggering_position=degirum_tools.AnchorPoint.CENTER,
        window_name=display.window_name,
    )

We’ll specify the triggering position to be the center of the object bounding box: triggering_position=degirum_tools.AnchorPoint.CENTER. Possible triggering positions are all four vertices and all four centers of the bounding box edges.
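To make the triggering positions concrete, the sketch below computes the nine candidate anchor points of a bounding box given as [x1, y1, x2, y2]: the four corners, the four edge centers, and the center. The helper function and its key names are illustrative, not part of degirum_tools:

```python
def anchor_points(x1, y1, x2, y2):
    """Nine candidate anchor points of a bounding box: corners, edge centers, and center."""
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return {
        "TOP_LEFT": (x1, y1), "TOP_CENTER": (cx, y1), "TOP_RIGHT": (x2, y1),
        "CENTER_LEFT": (x1, cy), "CENTER": (cx, cy), "CENTER_RIGHT": (x2, cy),
        "BOTTOM_LEFT": (x1, y2), "BOTTOM_CENTER": (cx, y2), "BOTTOM_RIGHT": (x2, y2),
    }

print(anchor_points(100, 200, 300, 400)["CENTER"])  # → (200.0, 300.0)
```

For vehicles viewed from above at an angle, BOTTOM_CENTER (the point where the object touches the ground) is often a better trigger than CENTER, since it corresponds to the object's actual position on the road surface.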

If you want to adjust polygon zones on the interactive display during run-time, you can specify the OpenCV window name of that interactive display: window_name=display.window_name. You can then drag a whole zone polygon with the left mouse button and move individual polygon vertices with the right mouse button.

Press the space bar to pause/resume the video stream.

Press the 'x' or 'q' key to stop streaming.

Define inference loop

    with degirum_tools.open_video_stream(video_source) as stream:
        for result in model.predict_batch(
            degirum_tools.video_source(stream)
        ):
            img = result.image_overlay
            img = zone_counter.analyze_and_annotate(result, img)
            display.show(img)

We open the video stream using degirum_tools.open_video_stream.

Then, we supply a video source to the input of the model.predict_batch method, which performs AI model inference on each frame retrieved from the video stream in an efficient pipelined manner.

The result object contains the inference results, which are stored in the result.results list.

The result.image_overlay method draws AI annotations on top of the original video frame image. These annotations include bounding boxes for all detected objects.

The zone_counter.analyze_and_annotate method counts objects in polygon zones and draws object counts on top of the provided image.

Finally, display.show(img) displays a fully annotated image in an OpenCV interactive window.

To simplify the boilerplate code, you may use the degirum_tools.predict_stream function which effectively performs the same steps as above:

for result in degirum_tools.predict_stream(
    model, video_source, analyzers=zone_counter
):
    display.show(result)

Access zone counting results

If you want to know which object belongs to which zone, you may analyze the "in_zone" key of each result dictionary. Its value is the list of Boolean flags, one per zone. The flag is True if the object is detected in the corresponding zone.

If you want to obtain per-zone counts, this information is stored in the result.zone_counts list of dictionaries. This list contains one element per defined polygon zone. Each dictionary contains the count of objects detected in the corresponding zone. The key is the class name and the value is the count of objects of this class detected in the zone. If the per_class_display constructor parameter is False, the dictionary contains only one key "total", whose value is the total object count in the corresponding zone.
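As an illustration of the structures described above, the sketch below aggregates per-zone, per-class counts from mocked per-object results carrying "in_zone" flags. The detection data is made up; in a real application these values come from the model and ZoneCounter:

```python
from collections import Counter

# Mocked per-object results in the shape described above (illustrative data only)
results = [
    {"label": "car",   "in_zone": [True,  False]},
    {"label": "truck", "in_zone": [True,  False]},
    {"label": "car",   "in_zone": [False, True]},
    {"label": "car",   "in_zone": [False, False]},  # outside both zones
]

num_zones = 2
zone_counts = []
for zi in range(num_zones):
    # per-class counts of objects whose in_zone flag is set for this zone
    counts = Counter(obj["label"] for obj in results if obj["in_zone"][zi])
    counts["total"] = sum(counts.values())
    zone_counts.append(dict(counts))

print(zone_counts)
# → [{'car': 1, 'truck': 1, 'total': 2}, {'car': 1, 'total': 1}]
```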

Full code example

import degirum as dg, degirum_tools

model_zoo_url = "degirum/hailo"
model_name = "yolov8n_coco--640x640_quant_hailort_hailo8_1"
video_source = "https://github.com/DeGirum/PySDKExamples/raw/main/images/Traffic.mp4"

polygon_zones = [
    [[265, 260], [730, 260], [870, 450], [120, 450]],  # zone 1
    [[400, 100], [610, 100], [690, 200], [320, 200]],  # zone 2
]

# connect to AI inference engine
zoo = dg.connect(dg.CLOUD, model_zoo_url, degirum_tools.get_token())

# load model
model = zoo.load_model(model_name, output_class_set={"car", "motorbike", "truck"})

with degirum_tools.Display("AI Camera") as display:
    zone_counter = degirum_tools.ZoneCounter(
        polygon_zones,
        triggering_position=degirum_tools.AnchorPoint.CENTER,
        window_name=display.window_name,
    )

    for result in degirum_tools.predict_stream(
        model, video_source, analyzers=zone_counter
    ):
        display.show(result)
        print(result.zone_counts)