I can use compound models like CroppingAndDetectingCompoundModel, but I'm frustrated by the results: when I use add_model1_results=True, every entry in result.results is a separate array element, so to determine which "faces" go with which "person" I have to manually match up which bbox lies inside which other result's bbox.
For cropping compound models, it would be wonderful to have nested results, where each model2 result is an array element within its parent model1 result, so that we don't have to manually determine which results are contained within the others.
Unfortunately, we cannot change the format of result.results: it must be a list of dictionaries with standard keys such as "score", "bbox", and "label"; otherwise result.image_overlay will break. But we can add any additional keys to each result.results[] dictionary.
For example, we can add a "crop_index" key to every model2 result. This key will contain the index of the corresponding model1 result within result.results, so you can access it this way:
for res in result.results:
    if res["label"] == "face":  # model2 (face) result
        person = result.results[res["crop_index"]]  # corresponding model1 (person) result
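A self-contained sketch of how that lookup could be used to group every face under its parent person. The result dictionaries below are hypothetical stand-ins for real inference output; only the "crop_index" key itself comes from the proposal above:

```python
# Hypothetical flat results list, mimicking result.results with a "crop_index" key:
# model1 ("person") entries first, then model2 ("face") entries that reference them.
results = [
    {"label": "person", "bbox": [10, 10, 200, 400], "score": 0.95},
    {"label": "person", "bbox": [220, 20, 400, 410], "score": 0.91},
    {"label": "face", "bbox": [60, 30, 120, 90], "score": 0.88, "crop_index": 0},
    {"label": "face", "bbox": [280, 40, 340, 100], "score": 0.84, "crop_index": 1},
]

# Group each face under the index of the person it was cropped from.
faces_by_person = {}
for res in results:
    if res["label"] == "face":
        faces_by_person.setdefault(res["crop_index"], []).append(res)

for person_idx, faces in faces_by_person.items():
    person = results[person_idx]
    print(person["bbox"], "->", [f["bbox"] for f in faces])
```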
That’s a great alternative option. Just some way to link the two results together so we don’t have to manually compute if the bbox coordinates are within each other in order to link the two results. Feel free to think through it so you come to the best solution and I’ll look for the update in the next version. Love your work so far so thank you!
Looks like @vladk has a great solution but figured I would pass this along just for information in case it helps your team understand the need.
pose_det_model_name = "yolo11m_coco_pose--640x640_quant_hailort_multidevice_1"
face_det_model_name = "yolov8n_relu6_widerface_kpts--640x640_quant_hailort_hailo8l_1"
your_host_address = "@local" # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "SECRET_PATH/hailo_examples/models"
your_token = ""
device_type = ['HAILORT/HAILO8L']
# Load the model
PoseModel = dg.load_model(
    model_name=pose_det_model_name,
    inference_host_address=your_host_address,
    zoo_url=your_model_zoo,
    token=your_token,
    # optional parameters, such as overlay_show_probabilities=True
)

# Face Model
FaceModel = dg.load_model(
    model_name=face_det_model_name,
    inference_host_address=your_host_address,
    zoo_url=your_model_zoo,
    token=your_token,
    device_type=device_type,
    overlay_color=[(255, 255, 0), (0, 255, 0)],
)

# Create a compound cropping model with combined results
crop_model = degirum_tools.CroppingAndDetectingCompoundModel(
    PoseModel,
    FaceModel,
    add_model1_results=True,
)
In the final results I get a flat list of dictionaries, and the only way to link a "face" to its corresponding "person" is to build a temporary data structure and compute whether each "face" bbox lies within a "person" bbox. Since this compound model is a cropping model, it would make sense for it to return results in a way that keeps track of that correspondence. The "crop_index" key discussed above looks like a good solution.
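For reference, the manual containment workaround described above looks roughly like this. The bbox format is assumed to be [x1, y1, x2, y2], and the coordinates are hypothetical:

```python
def bbox_contains(outer, inner, tol=0.0):
    """Return True if the inner box lies within the outer box (with optional tolerance)."""
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return (ix1 >= ox1 - tol and iy1 >= oy1 - tol
            and ix2 <= ox2 + tol and iy2 <= oy2 + tol)

# Hypothetical flat results, as returned before "crop_index" existed.
results = [
    {"label": "person", "bbox": [10, 10, 200, 400]},
    {"label": "person", "bbox": [220, 20, 400, 410]},
    {"label": "face", "bbox": [60, 30, 120, 90]},
    {"label": "face", "bbox": [280, 40, 340, 100]},
]

persons = [r for r in results if r["label"] == "person"]
faces = [r for r in results if r["label"] == "face"]

# Pair each face with any person box that fully contains it.
pairs = [(p, f) for f in faces for p in persons if bbox_contains(p["bbox"], f["bbox"])]
```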
Now, when add_model1_results=True, each model2 entry in result.results has a "crop_index" key containing the index of the model1 result used for the crop. Also, model1 results are placed at the beginning of the result.results list, followed by model2 results.
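With that ordering, the flat list can be regrouped into the nested shape requested earlier in the thread. The data below is illustrative only; only the ordering and the "crop_index" key come from the update above:

```python
# Hypothetical result.results: model1 ("person") entries first,
# followed by model2 ("face") entries carrying "crop_index".
results = [
    {"label": "person", "bbox": [10, 10, 200, 400]},
    {"label": "person", "bbox": [220, 20, 400, 410]},
    {"label": "face", "bbox": [60, 30, 120, 90], "crop_index": 0},
    {"label": "face", "bbox": [280, 40, 340, 100], "crop_index": 1},
]

# Copy each person entry and give it an empty "faces" list...
nested = [dict(p, faces=[]) for p in results if "crop_index" not in p]

# ...then attach each face to its parent person via crop_index.
for res in results:
    if "crop_index" in res:
        nested[res["crop_index"]]["faces"].append(res)
```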
Amazing! Already have the code implemented. Saved a ton of work. Appreciate the super communication and quick updates. I will share more about my project once we go live.