Hi everyone!
Today was a very bad day. It started when I noticed very high CPU usage on my test PC, and once I started digging I found many copies of my script running, all calling the DeGirum service.

At first I thought it was down to my own lack of knowledge and the lack of a locking mechanism in the API I'd created to start the script. When a request hits the API, it Popens a new process running the inference script… at least that was the idea…

The problem: many processes were being created at the same time.

Since everything pointed to some sort of race condition between workers, I spent the whole day implementing locking systems: local locks with variables, system-wide locking using files, even system-agnostic locking using Redis, all without luck.

Well, I'm lying, there was some luck: at first each request to the API created 4 to 5 processes; after implementing the first locking system using variables, that dropped to only 2 processes per request. But that is still a problem, because both the API and Redis only ever detected 1 process being created.
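For reference, this is roughly the kind of system-wide file lock I was trying (a minimal sketch, assuming Linux; the lock path is hypothetical). Note that a lock like this can't catch the duplication anyway: `flock` locks belong to the open file description, so a child created by `fork()` inherits the lock and both processes hold it happily, which would match the API and Redis only ever seeing 1 process.

```python
import fcntl
import os

LOCK_PATH = "/tmp/stream_inference.lock"  # hypothetical path

def acquire_single_instance_lock(path=LOCK_PATH):
    """Return an open lock-file handle if no other instance holds the lock,
    otherwise None. The handle must stay open for the process lifetime."""
    fd = open(path, "w")
    try:
        # non-blocking exclusive lock: fails immediately if already held
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fd.close()
        return None
    fd.write(str(os.getpid()))
    fd.flush()
    return fd
```

A second instance calling `acquire_single_instance_lock()` while the first still holds its handle gets `None` back, but a forked child of the first instance slips right through.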
I was about to give up when I ran the test I should have done first: launching the script directly from the terminal, without any API call. My heart sank:

Running the script directly in the terminal creates 2 different processes.

The script I'm running is basically the 'smart_nvr' demo with some arguments added so I can 'personalize' the inference (I use the API call to change the object class to detect, the confidence threshold, etc.):
```shell
python3 ./stream.py --input rtmp://input.server/live/livestream --output rtmp://output.server/live/livestream --model_zoo_url aiserver://home/pi/DeGirum/zoo --model_name yolov8s_coco--640x640_quant_hailort_hailo8_1 --device HAILORT/HAILO8 --confidence 0.3 --classes clock
```
After launching the script from the terminal, `ps` shows two different processes:
```
livestream # ps -aux | grep stream
root 2100842 95.3 0.7 2721372 504864 pts/0 Sl 22:31 0:34 python3 ./stream.py --input rtmp://input.server/live/livestream --output rtmp://output.server/live/livestream --model_zoo_url aiserver://home/pi/DeGirum/zoo --model_name yolov8s_coco--640x640_quant_hailort_hailo8_1 --device HAILORT/HAILO8 --confidence 0.3 --classes clock
root 2100870 0.5 0.1 995444 100684 pts/0 Sl 22:31 0:00 python3 ./stream.py --input rtmp://input.server/live/livestream --output rtmp://output.server/live/livestream --model_zoo_url aiserver://home/pi/DeGirum/zoo --model_name yolov8s_coco--640x640_quant_hailort_hailo8_1 --device HAILORT/HAILO8 --confidence 0.3 --classes clock
root 2101117 0.0 0.0 6612 2196 pts/0 S+ 22:32 0:00 grep --color=auto stream
```
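A quick way to tell whether the second process is a child of the first (i.e. the script forks internally) or a sibling launched twice from outside is to compare parent PIDs. Something like this (a diagnostic sketch, assuming a Linux `ps`):

```shell
# List PID, parent PID and command for every running stream.py process.
# If the second process's PPID equals the first process's PID, the
# script itself forked the duplicate; otherwise something external
# launched the script twice.
ps -eo pid,ppid,cmd | grep '[s]tream\.py' || echo "no stream.py processes found"
```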
The main problem is that when I call the API to kill the program, it is only able to kill one of the two, so after a few start/stop calls there are enough leftover scripts running to ruin the Hailo device's performance completely.

This is my version of smart_nvr (it takes one RTMP or RTSP stream and forwards the inference, with tracking IDs and red bounding boxes, to another server):
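On the kill side, one common pattern (a sketch, assuming the API launches the script with Python's `subprocess.Popen`, which matches the "popens a new process" setup) is to start the script in its own session and then signal the whole process group, so any process the script forks dies with it:

```python
import os
import signal
import subprocess

def start_inference(cmd):
    """Launch the inference script in its own session/process group.
    start_new_session=True calls setsid() in the child, so anything the
    script forks stays in that group."""
    return subprocess.Popen(cmd, start_new_session=True)

def stop_inference(proc, timeout=10):
    """Kill the whole process group, not just the direct child."""
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    proc.wait(timeout=timeout)
```

With this, one SIGTERM to the group takes down both processes from the `ps` listing at once, instead of leaving an orphan behind.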
```python
import argparse

import degirum as dg
import degirum_tools
from degirum_tools import streams as dgstreams

# command-line arguments so the API can "personalize" each run
parser = argparse.ArgumentParser(description="Stream video with object detection.")
parser.add_argument('--input', type=str, default="rtmp://input.server/live/livestream", help='The video source URL.')
parser.add_argument('--output', type=str, default="rtmp://output.server/live/livestream", help='The output URL path.')
parser.add_argument('--model_name', type=str, default="yolo11s_coco--640x640_quant_hailort_hailo8_1", help='The model chosen to do the inference')
parser.add_argument('--confidence', type=float, default=0.5, help='Confidence threshold value')
parser.add_argument('--classes', type=str, default="people", help='Class labels to search for')
parser.add_argument('--model_zoo_url', type=str, default="aiserver://home/pi/DeGirum/zoo", help='URL path of the model zoo.')
parser.add_argument('--device', type=str, default="HAILORT/HAILO8", help='Neural chip type')
args = parser.parse_args()

dg.log.DGLog.set_verbose_state('DEBUG')

hw_location = "@local"
model_name = args.model_name
model_zoo_url = args.model_zoo_url
video_source = args.input
video_output = args.output
classes = set(args.classes.split(','))
device_type = args.device
confidence = args.confidence

# connect to the inference host and model zoo
model_manager = dg.connect(
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
)

# load the model with red overlay boxes, filtered to the requested classes
model = model_manager.load_model(
    model_name=model_name,
    device_type=device_type,
    output_confidence_threshold=confidence,
    input_pad_method="letterbox",
    image_backend='opencv',
    overlay_color=[255, 0, 0],
    output_class_set=classes,
)

# create object tracker
anchor = degirum_tools.AnchorPoint.CENTER
tracker = degirum_tools.ObjectTracker(
    class_list=classes,
    track_thresh=0.35,
    track_buffer=100,
    match_thresh=0.9999,
    trail_depth=20,
    anchor_point=anchor,
    show_only_track_ids=True,
    annotation_color=[255, 0, 0],
)
degirum_tools.attach_analyzers(model, [tracker])

# build the gizmo pipeline: video source -> AI inference -> RTMP streamer
cam_source = dgstreams.VideoSourceGizmo(video_source)
detector = dgstreams.AiSimpleGizmo(model)
streamer = dgstreams.VideoStreamerGizmo(video_output, show_ai_overlay=True)
dgstreams.Composition(cam_source >> detector >> streamer).start()
```
Does anyone know why this smart_nvr-style script is spawning two processes instead of only one, and how I can fix it?
