Do you have any performance metrics for the compiled OBB models in the DeGirum model zoo? Something with details like the output of the parse-hef command using hailortcli, or stats like FPS, which host CPU was used during testing, and post-processing times (IoU, bbox decoding, etc.)?
Hi @arinjay4756
Welcome to the DeGirum community. Here is the benchmark info for the requested model. Please note that the model is compiled for Hailo8l but can run on Hailo8 as well. We benchmarked it on Hailo8. If there is a specific configuration for which you want to see the benchmark, please let us know.
Thanks a lot Shashi, this is really detailed and helpful. I am surprised that so much of the pre-processing and post-processing has been offloaded to the neural core as well. What are the outputs after the post-processing in the core for a single frame? Do they still need host-side post-processing like IoU, bbox decoding, or NMS? Or is that the part that has been offloaded to the neural core?
Hi @arinjay4756
Actually, not much has been offloaded to the neural core in this model. The "core" in the results refers to the PySDK core, not the neural core of Hailo. Only the inference part runs on the Hailo accelerator; the rest runs on the CPU. For YOLO models, there is no pre-processing once the input is resized to the size expected by the model (the benchmarks do not measure resize overhead). The post-processing is written by us in C++: it starts with the output tensors (bboxes, scores, angles) at different resolutions and performs bbox decoding and NMS. Since this is a powerful CPU and the code is in C++, the overhead is negligible.
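For readers curious what the NMS step involves, here is a minimal Python sketch of greedy non-maximum suppression. Note this is illustrative only: the actual PySDK post-processor is C++, and OBB models use rotated-box IoU, whereas this sketch uses simple axis-aligned boxes.

```python
import numpy as np

def iou(box, boxes):
    # IoU of one (x1, y1, x2, y2) box against each row of `boxes`
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS: keep the highest-scoring box, drop overlapping boxes
    # whose IoU with it exceeds the threshold, then repeat.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return keep
```

For example, two heavily overlapping detections of the same object collapse to the higher-scoring one, while a distant detection survives.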
Okay 👍 Thank you for the clarification. How exactly is the FPS calculated in the benchmarks you mentioned? Assuming a batch size of 1, if the average total frame duration is around 65 ms, shouldn't that come to around 15 FPS? The relevant stat for me is how many inference results I can get per second, where one inference result = the 2D position and orientation of the object I want to detect.
Hi @arinjay4756
The way AI accelerators work is that multiple inference jobs are in the pipeline at once. So even though the latency is 65 ms, the throughput is much higher because multiple jobs are submitted in a queue: as soon as frame 1 finishes some layers, frame 2 gets started, and so on. In Hailo systems, approximately 6 frames can be in the pipeline (our estimate from extensive experimentation), hence the FPS numbers are much higher than 1/latency. To answer your question: if there are no other bottlenecks in the system, you should see the FPS we reported. Hope this clarifies your issue.