I compiled my fine-tuned YOLO11-L model with DeGirum, but the model’s behavior has become abnormal: it worked correctly in a CUDA environment, yet after conversion to an HEF file it started producing an excessive number of detections.
My hardware setup uses Hailo-8. During compilation, I set the input image width and height to 640, and configured the runtime and device as HAILORT and HAILO8, respectively. For calibration, I randomly selected 100 images from the training dataset.
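For context, the calibration selection amounts to something like this sketch (the directory paths are placeholders, not my actual dataset layout):

```python
# Sketch: randomly sample 100 calibration images from the training set.
import random
import shutil
from pathlib import Path

train_dir = Path("dataset/train/images")   # placeholder path to training images
calib_dir = Path("calibration_images")     # folder handed to the compiler for calibration
calib_dir.mkdir(exist_ok=True)

images = sorted(train_dir.glob("*.jpg"))
random.seed(0)                             # make the random selection reproducible
for img in random.sample(images, 100):
    shutil.copy(img, calib_dir / img.name)
```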
The JSON file generated after compilation looked correct, so I expected the HEF model to run normally. However, when I ran the real-time detection module with a Raspberry Pi camera, the screen was filled with a large number of random bounding boxes.
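For reference, a minimal sketch of such a real-time loop with the DeGirum PySDK (the zoo path, model name, and camera source are placeholders, and my actual module differs in details, e.g. the Pi camera may be read through picamera2 instead of OpenCV):

```python
# Simplified sketch of a real-time detection loop on Hailo-8 via the DeGirum PySDK.
# Zoo path, model name, and camera index are placeholders.
import cv2
import degirum as dg

zoo = dg.connect(dg.LOCAL, "path/to/local/model/zoo")  # local HailoRT inference
model = zoo.load_model("yolo11l_custom")               # placeholder model name

cap = cv2.VideoCapture(0)                              # camera source (placeholder)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = model.predict(frame)                      # run the compiled HEF model
    cv2.imshow("detections", result.image_overlay)     # frame with boxes drawn
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```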
I would greatly appreciate your technical support and guidance on this matter.
I will also attach the following files for reference:
An image showing the corrupted detections after converting to HEF
We reproduced the behavior: the model shows the same failures under OpenVINO INT8 (while FP32 works), which indicates the network itself is highly sensitive to quantization rather than this being a Hailo-specific issue. Note that OpenVINO INT8 quantization is less restrictive than Hailo’s full-INT8 flow, so sensitivity here is a strong signal.
Recommendations
Prefer a smaller YOLO variant (larger models tend to be more quantization-sensitive).
Retrain with ReLU6 activations (e.g., yolov8l_relu6) to reduce quantization loss; see the sketch after this list.
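For the ReLU6 retraining, a minimal sketch with the Ultralytics API (this assumes a copy of the stock large-model YAML saved with an added activation override; the file names, dataset, and training arguments are placeholders):

```python
# Sketch: retrain with ReLU6 activations via Ultralytics.
# Assumes a copy of the stock model YAML saved as yolov8l_relu6.yaml
# with one extra top-level line:
#   activation: nn.ReLU6()
from ultralytics import YOLO

model = YOLO("yolov8l_relu6.yaml").load("yolov8l.pt")  # build with ReLU6, transfer pretrained weights
model.train(data="your_dataset.yaml", imgsz=640, epochs=100)  # placeholder training arguments
```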
So, does this mean that with the current model there’s no way to solve the issue?
Do you have any recommended model sizes? For example, would the M size work, and have you seen any cases where it was used successfully?
But one thing I’m curious about: the base YOLO11-L model quantized and ran well on Hailo-8, and its performance was good. Why is it that once we fine-tune the model, quantization doesn’t work as well?
This behavior is very dependent on the dataset, and there is no easy rule to predict for which models it will happen. Do you know the size of the training data for this model, and how many epochs you trained it for?
Sorry, by dataset size I meant the number of images. So you started from the COCO-pretrained YOLO11-L weights and ran 3 epochs of training? What does “repeated 3 times” mean?
Sorry for the confusion. The dataset contains about 700,000 images in total.
And for training, you can consider it as having run for a total of 9 epochs.
To be precise, the dataset was stored on Google Drive and I trained in a Colab environment. Since I couldn’t upload the entire 700 GB dataset at once, I split it into 100 GB chunks for training. So the process was: upload 100 GB → train for 3 epochs → upload the next 100 GB → train again → … → repeat this process over the entire dataset 3 times.
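In code terms, the loop was roughly the following sketch (assuming the Ultralytics API; the chunk and checkpoint paths are placeholders, and each round continued from the weights of the previous run):

```python
# Sketch of the chunked Colab training loop described above (placeholder paths).
from pathlib import Path
from ultralytics import YOLO

chunk_yamls = sorted(Path("chunks").glob("chunk*.yaml"))  # one data YAML per 100 GB chunk
weights = "yolo11l.pt"                                    # start from COCO-pretrained weights

for _ in range(3):                                        # repeat over the entire dataset 3 times
    for chunk in chunk_yamls:
        model = YOLO(weights)
        model.train(data=str(chunk), epochs=3, imgsz=640)
        weights = "runs/detect/train/weights/last.pt"     # placeholder: checkpoint of the run that just finished
```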
Could training in this way have affected the quantization results?
This is mostly a matter of dataset size and overfitting. The default training hyperparameters for yolov8n.pt and yolov8l.pt differ in their augmentation settings precisely so that the larger model does not overfit.
The yolov8l and yolo11l defaults use more aggressive augmentation:
--scale 0.9 --mixup 0.15 --copy-paste 0.3
vs. the yolov8n and yolov8s augmentation:
--scale 0.5 --mixup 0 --copy-paste 0
Using the stronger augmentation will help with overfitting and generalization, which in turn also helps with quantization sensitivity.
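For example, a sketch of passing those large-model augmentation values explicitly when fine-tuning with the Ultralytics API (the dataset YAML and epoch count are placeholders):

```python
# Sketch: fine-tune with the more aggressive large-model augmentation settings.
from ultralytics import YOLO

model = YOLO("yolo11l.pt")
model.train(
    data="your_dataset.yaml",  # placeholder dataset config
    imgsz=640,
    epochs=100,                # placeholder epoch count
    scale=0.9,                 # large-model default
    mixup=0.15,                # large-model default
    copy_paste=0.3,            # large-model default
)
```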
My suggestions are:
Train a smaller model (S or M).
Train with ReLU6.
When training larger models, use more aggressive augmentation.