DeGirum PySDK enables seamless integration of AI models into real-time applications, including live video analysis from RTSP-enabled cameras. This guide walks you through processing and displaying AI inference results dynamically from an RTSP stream using the YOLOv8 object detection model.
Prerequisites
DeGirum PySDK: Installed and configured on your system. See DeGirum/hailo_examples for instructions.
RTSP camera stream: Obtain the RTSP URL of your camera. Replace username, password, ip, and port in the script with your camera’s credentials.
Token: If using cloud inference, ensure you have a valid token. For local inference, leave the token empty.
Script overview
This script:
Loads the YOLOv8 object detection model.
Processes the RTSP video stream to detect objects in real time.
Displays the inference results dynamically in a dedicated window.
Code example
import degirum as dg, degirum_tools
# Choose inference host address
inference_host_address = "@cloud"
# inference_host_address = "@local"
# Choose zoo_url
zoo_url = "degirum/models_hailort"
# zoo_url = "../models"
# Set token
token = degirum_tools.get_token()
# token = '' # Leave empty for local inference
# Specify the AI model and video source
model_name = "yolov8n_relu6_coco--640x640_quant_hailort_hailo8l_1"
video_source = "rtsp://username:password@ip:port/cam/realmonitor?channel=1&subtype=0" # Replace with your camera RTSP URL
# Load the AI model
model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token,
)
# Run AI inference on the video stream and display the results
with degirum_tools.Display("AI Camera") as output_display:
    for inference_result in degirum_tools.predict_stream(model, video_source):
        output_display.show(inference_result)
Steps to run the script
Set up the RTSP stream:
Replace the video_source string with the RTSP URL of your camera.
Example format: rtsp://username:password@ip:port/cam/realmonitor?channel=1&subtype=0.
Configure inference:
Use @cloud for cloud inference or @local for local device inference.
Specify the appropriate zoo_url for accessing your model zoo.
Load the model:
Replace model_name with your desired model if you want to detect objects other than those covered by the default YOLOv8 COCO configuration.
Run the script:
Execute the script to process the RTSP feed in real time.
The detected objects will be displayed dynamically in the window labeled “AI Camera.”
Stop the display:
Press x or q to exit the display window.
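As a small aid to step 1 above, the RTSP URL can be assembled from its parts instead of being hand-edited in place. This is only an illustrative sketch; the make_rtsp_url helper and its arguments are hypothetical and not part of PySDK:

```python
from urllib.parse import quote

def make_rtsp_url(username, password, ip, port,
                  path="cam/realmonitor?channel=1&subtype=0"):
    # quote() percent-encodes special characters (e.g. '@') that may
    # appear in credentials and would otherwise break the URL
    return f"rtsp://{quote(username)}:{quote(password)}@{ip}:{port}/{path}"

video_source = make_rtsp_url("username", "password", "192.168.1.10", 554)
```

The resulting string can be passed to predict_stream exactly as in the code example above.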
Applications
Surveillance: Monitor live feeds for security and safety.
Traffic analysis: Analyze vehicles and pedestrians in real time.
Industrial monitoring: Detect objects in manufacturing or warehouse operations.
Additional resources
For more examples and advanced use cases, visit our Hailo Examples Repository. This repository provides scripts and guidance for deploying AI models on various hardware configurations.
Hello @shashi , good day to you.
Thank you for sharing the above guide.
I am trying to implement the same, however I get the following errors while executing it,
“error while decoding”
“left block unavailable for requested intra mode”
“cabac decode of qscale diff failed at”
import degirum as dg, degirum_tools
#Basic setup
inference_host_address = "@local"
zoo_url = "/home/pi/tests/models"
model_name = "yolov8n_relu6_human_head--640x640_quant_hailort_hailo8l_1"
video_source='rtsp://192.168.100.54:8554/cam'
print("Load Model..")
model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
)
print("Press 'x' or 'q' to stop.")
# show results of inference
with degirum_tools.Display("AI Camera") as output_display:
    for inference_result in degirum_tools.predict_stream(model, video_source):
        output_display.show(inference_result)
Setup:
The RTSP server is a Raspberry Pi Zero 2 W which is connected to the Raspberry Pi Camera 3 module.
The RTSP stream was validated through VLC Media player over laptop running windows and it was okay.
The Above code is running on Raspberry Pi 5 with the AI Hat containing Hailo8l chip.
NOTE: The Raspberry Pi Zero 2 W, the Raspberry Pi 5, and the laptop are all connected to the same WiFi network and are therefore on the same LAN segment.
Observations:
While the video output in the VLC media player was smooth, there were issues visualising the output through output_display.
It looked like the video stream was not being completely rendered.
However, when I stand near the camera for a few seconds and wait for the video/frame to stabilise, the face/head is detected and the rectangle appears; still, the full image is never completely rendered.
Could you please let me know the reason for the above errors and the corrections I should make on my side in order to perform the inference correctly?
Check if the RTSP stream renders smoothly without any inference; you can use the code below for this:
import degirum as dg, degirum_tools
video_source="<your rtsp url>"
# show video
with degirum_tools.Display("Camera") as display:
    with degirum_tools.open_video_stream(video_source) as video_stream:
        for frame in degirum_tools.video_source(video_stream):
            display.show(frame)
Check if you can decrease the FPS and/or resolution of your RTSP stream, as video decoding and resizing can take up a lot of compute.
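One more thing worth trying (an assumption on my side, since H.264 decode errors like the ones quoted above are often caused by UDP packet loss): OpenCV's FFmpeg backend, which degirum_tools uses under the hood for video capture, can be forced to use TCP transport for RTSP via an environment variable. A minimal sketch:

```python
import os

# Must be set before the RTSP stream is opened, so that OpenCV's FFmpeg
# backend picks up TCP transport instead of the default (UDP).
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "rtsp_transport;tcp"

# ...then run the original script unchanged, e.g.:
# import degirum as dg, degirum_tools
```

TCP retransmits lost packets, which trades a little latency for intact frames; on a congested WiFi segment this often makes these decoder errors disappear.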
Hello @shashi ,
I highly appreciate the guidance provided, thank you.
I followed your instructions and observed that the behaviour did not change in any way. To reduce the computation, I even changed the code to just print. The errors started appearing even before the print statement was executed.
I have reduced the fps of the video to 15 (original was 30), but saw no change. I will try lowering the resolution and reducing the fps further to see if that helps.
It looks like the reference link you provided has a good explanation for it. I will go through it more deeply and try to work on it.
I tried the above code snippet using a USB webcam, and it was definitely very smooth. It could process around 24 fps, and there were no errors.
However, when the RTSP source was used, the errors remained and the rendering was still not okay, which affected the object detection.
Based on the above results, I think I will try working with a webcam instead of RTSP, but I will also try to understand why the RTSP stream simply won't work even though the resolution was decreased and the fps was reduced to 5.
As always, I appreciate the help and guidance provided.
Thank you very much.