DeGirum YOLOv11s Object detection evaluation: Accuracy metrics decreased compared to Hailo-model-zoo result

Hi,

I’d like to share some notes about benchmarking object detection with Yolov11s.hef on HAILO8:

Setup:

Hardware: UP AI edge board with a Hailo8 accelerator.

Benchmark model: YOLOv11s.hef from Hailo-model-zoo github repository. Pretrained, 80 classes, default release by Hailo.

YOLO model download Path: link.

Dataset: COCO-2017-val (5000 images)

Libraries used: as listed in the DeGirum evaluation guide, "Hailo guide: Evaluating model accuracy after compilation".

Library versions:
Ubuntu OS 22.04 Jammy

Python: 3.10.19
HailoRT: 4.20.0

Hailo DFC: v3.31.0

Hailo Model Zoo: 2.16.0

DeGirum Tools: 0.22.4.

Results after running evaluation:

FPS: 77, versus 111 FPS reported by the Hailo Model Zoo. (Measured with the model_time_profile module from DeGirum Tools.)

mAP 50-95: 38%, versus 45.2% reported by the Hailo Model Zoo.

FPS dropped by about 30% compared to the value stated in the Hailo Model Zoo. I investigated and found that the PCIe Gen3 link is running on only 2 of the 4 available lanes, which might be part of the reason for the lower FPS.
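As a rough sanity check on the numbers above, here is a minimal Python sketch that computes the relative FPS drop and the theoretical PCIe Gen3 bandwidth for 2 versus 4 lanes. The bandwidth figure assumes the standard 8 GT/s per lane with 128b/130b encoding and ignores protocol overhead. The helper names are my own, and the device address passed to `read_link_width` is a placeholder that would need to be replaced with the address shown by `lspci` on the actual host.

```python
from pathlib import Path
from typing import Optional


def relative_drop(measured: float, reference: float) -> float:
    """Fractional drop of `measured` relative to `reference`."""
    return (reference - measured) / reference


def pcie_gen3_bandwidth_gbytes(lanes: int) -> float:
    """Theoretical one-direction PCIe Gen3 bandwidth in GB/s.

    Assumes 8 GT/s per lane and 128b/130b line encoding;
    packet and protocol overhead are not included.
    """
    gbits_per_lane = 8.0 * 128 / 130
    return gbits_per_lane * lanes / 8.0


def read_link_width(bdf: str) -> Optional[int]:
    """Read the negotiated link width from Linux sysfs, or None if unavailable.

    `bdf` is the PCI address (domain:bus:device.function); the value used
    below is a placeholder, not the address of any particular board.
    """
    path = Path(f"/sys/bus/pci/devices/{bdf}/current_link_width")
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


fps_drop = relative_drop(measured=77, reference=111)
bw_x2 = pcie_gen3_bandwidth_gbytes(2)
bw_x4 = pcie_gen3_bandwidth_gbytes(4)

print(f"FPS drop: {fps_drop * 100:.1f}%")
print(f"PCIe Gen3 x2: {bw_x2:.2f} GB/s, x4: {bw_x4:.2f} GB/s")
print("Negotiated width:", read_link_width("0000:00:00.0"))  # placeholder address
```

The lane count halves the theoretical link bandwidth, but whether that actually limits this model depends on how much data moves per frame, so this is only a starting point for the investigation.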

Accuracy also dropped, although it should be approximately the same as the value stated in the Hailo Model Zoo.

Question 1: Are there any other factors that might slow down FPS during inference?

Question 2: What are the possible causes for accuracy decrease when evaluating using DeGirum Tools on the yolov11s.HEF model?

Please help! Thank you.

Hi @LongVu-Tr

Welcome to DeGirum community.

Just to be sure: is the HEF file from the Hailo Model Zoo, or did you compile another yolo11s yourself? Can you please share how you obtained the 111 FPS and 45.2% mAP? Are these just the numbers from the Hailo Model Zoo GitHub page?

Hi @LongVu-Tr

We ran the evaluation of yolo11s and obtained the following results:

yolo11s_coco--640x640_quant_hailort_hailo8_1
[array([0.45358596, 0.62901933, 0.48389093, 0.2788012 , 0.4938611 ,
       0.624764  , 0.35266808, 0.57291141, 0.61078248, 0.43059618,
       0.66046639, 0.774312  ])]

So the mAP of 45.35% is very close to the reported 45.2%. If you share the exact script you used to evaluate the model (along with model JSON), we can see if there is any discrepancy.
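For readers comparing their own output against this array, the following sketch labels the 12 values. It assumes the array follows the standard order of `COCOeval.stats` from pycocotools; that assumption is consistent with the first element being the mAP 50-95 quoted above, but the evaluator's actual output layout has not been confirmed here. The helper name `label_coco_stats` is hypothetical.

```python
# Assumed order: the 12 summary metrics of pycocotools' COCOeval.stats.
COCO_STAT_NAMES = [
    "AP@[0.50:0.95]",
    "AP@0.50",
    "AP@0.75",
    "AP small",
    "AP medium",
    "AP large",
    "AR maxDets=1",
    "AR maxDets=10",
    "AR maxDets=100",
    "AR small",
    "AR medium",
    "AR large",
]

# Values copied from the evaluation result posted above.
stats = [
    0.45358596, 0.62901933, 0.48389093, 0.2788012, 0.4938611,
    0.624764, 0.35266808, 0.57291141, 0.61078248, 0.43059618,
    0.66046639, 0.774312,
]


def label_coco_stats(values):
    """Pair each value with its assumed COCO metric name."""
    if len(values) != len(COCO_STAT_NAMES):
        raise ValueError(f"expected {len(COCO_STAT_NAMES)} values, got {len(values)}")
    return dict(zip(COCO_STAT_NAMES, values))


labeled = label_coco_stats(stats)
for name, value in labeled.items():
    print(f"{name:<16} {value * 100:6.2f}%")
```

Comparing the per-size AP rows between two runs can hint at where a gap comes from, for example preprocessing or resize differences often affect small objects more than large ones.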

Regarding FPS: it is host dependent. If you believe PySDK is slower than the Hailo benchmark, please run the hailortcli command on the HEF file. On our system, we get the following results:
degirum:0.19.0 Observed FPS=83.7

With hailortcli, we get the below numbers:

FPS     (hw_only)                 = 83.55
        (streaming)               = 84.4837
Latency (hw)                      = 10.1392 ms
Device 0000:03:00.0:
  Power in streaming mode (average) = 1.97945 W
                          (max)     = 1.99689 W