This guide explains how to evaluate model accuracy after you’ve compiled your model and deployed it with PySDK to a Hailo accelerator (or a cloud or local server).
Evaluating accuracy helps you confirm that the compiled model performs as expected.
Tip: You can compile your model using the AI Hub Cloud Compiler.
If you’re new to the tool, check out the Cloud Compiler Quickstart Guide.
DeGirum PySDK provides built-in tools in degirum_tools to compute:
- mAP (mean Average Precision) for object detection, pose, and segmentation models
- Top-K Accuracy for image classification models
This guide shows you how to use these utilities with clean Python snippets.
Setting up your environment
This guide assumes that you have installed PySDK, the Hailo AI runtime and driver, and DeGirum Tools.
Click here for more information about installing PySDK.
Click here for information about installing the Hailo runtime and driver.
To install degirum_tools, run:
pip install degirum_tools
General evaluation steps
Follow these general steps to evaluate your model:
- Load the compiled model:
model = dg.load_model(
    model_name="your_model_name",
    inference_host_address="@local",  # or "@cloud", or your AI server address
    zoo_url="<model_zoo_url>",
    token=""  # AI Hub token, if required
)
- Select the appropriate evaluation class:
  - ObjectDetectionModelEvaluator – for object detection, pose estimation, or segmentation
  - ImageClassificationModelEvaluator – for image classification
- Provide the necessary inputs:
- A directory of images
- COCO-style annotations (required for detection tasks)
- Run evaluation and review results:
result = evaluator.evaluate()
print(result)
Object detection mAP evaluation example
For detection models, including Hailo-compiled YOLO variants:
import degirum as dg
import degirum_tools
from degirum_tools.detection_eval import ObjectDetectionModelEvaluator
# Load the detection model
model = dg.load_model(
model_name="yolov8n_relu6_face--640x640_quant_hailort_multidevice_1",
inference_host_address="@local",
zoo_url="degirum/hailo",
token=''
)
model.output_confidence_threshold = 0.001
model.output_nms_threshold = 0.7
model.output_max_detections = 300
model.output_max_detections_per_class = 300
# Optional class ID remapping: the i-th entry is the COCO category ID for model
# class index i (the standard 80-class → 91-ID COCO mapping)
classmap = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]
# Create evaluator
evaluator = ObjectDetectionModelEvaluator(model, classmap=classmap)
# Evaluation inputs
image_dir = "/path/to/val2017/images"
coco_json = "/path/to/annotations/instances_val2017.json"
# Evaluate on the full dataset and return mAP results (max_images=0 means no limit)
results = evaluator.evaluate(image_dir, coco_json, max_images=0)
# Print COCO-style mAP results
print("COCO mAP stats:", results[0])
When to use a classmap
Use a classmap if your model’s class indices differ from the standard COCO category IDs, or if you’re evaluating a custom label set against a standard COCO dataset. In the detection example above, the classmap is a list whose i-th entry is the COCO category ID assigned to model class index i.
Example classmap for a hypothetical three-class model (the category IDs come from the standard COCO list):
classmap = [1, 3, 18]  # model class 0 → person (1), class 1 → car (3), class 2 → dog (18)
This ensures accurate mapping between predicted labels and ground-truth category IDs.
COCO category examples
| Category | Model class index | COCO category ID |
|---|---|---|
| person | 0 | 1 |
| bicycle | 1 | 2 |
| car | 2 | 3 |
| … | … | … |
| toothbrush | 79 | 90 |
You can find the full category map in the categories section of the official annotation JSON file.
Image classification Top-K accuracy evaluation example
Use this if your image directory has subfolders per class:
import degirum as dg
import degirum_tools
from degirum_tools.classification_eval import ImageClassificationModelEvaluator
# Load classification model
model = dg.load_model(
model_name="yolov8s_imagenet--224x224_quant_hailort_hailo8_1",
inference_host_address="@local",
zoo_url="degirum/hailo",
token=''
)
# Create evaluator
evaluator = ImageClassificationModelEvaluator(
model,
top_k=[1, 5],
show_progress=True
)
# Folder structure should be: /images/cat/, /images/dog/, etc.
image_dir = "/path/to/classification_test"
# Run evaluation (no annotation file required)
results = evaluator.evaluate(image_dir, ground_truth_annotations_path="", max_images=0)
# Print top-k accuracy
print("Top-K Accuracies:", results[0])
Metric breakdown
Detection output (results[0])
- AP: mean Average Precision averaged over IoU thresholds 0.50–0.95
- AP50: mean Average Precision at an IoU threshold of 0.50
- AP75: mean Average Precision at an IoU threshold of 0.75
- AP_small, AP_medium, AP_large: AP broken down by object size
- AR: Average Recall statistics
Classification output (results[0])
- Top-1 Accuracy: fraction of images whose ground-truth label is the top prediction
- Top-5 Accuracy: fraction of images whose ground-truth label is among the top 5 predictions
Summary
- Use ObjectDetectionModelEvaluator for detection models
- Use ImageClassificationModelEvaluator for classification models
- For COCO-style evaluation, use a classmap if category IDs don’t match
- Use the max_images parameter to limit evaluation size for faster testing
You’re ready to evaluate!
Try evaluating your own models using these steps, and share your experience or questions by replying below.