Good evening. Is there a resource that details the quantization method ORCA applies when compiling an ONNX classifier (e.g. mobilenet_v2)? I am using the device as one of several candidate deployment hardware platforms for a research project, and I need to explain the quantization process for each of them. For example, HAILO covers this in its relevant documentation. Is there something similar for ORCA? Thank you.
ORCA uses TensorFlow's full integer post-training quantization, without per-axis (per-channel) quantization. You can see the details here: Post-training quantization | Google AI Edge | Google AI for Developers
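For intuition, the scheme referenced above is affine quantization with a single scale and zero-point per tensor (this is what "without per-axis quantization" means: one scale for the whole tensor rather than one per output channel). The sketch below is illustrative NumPy, not ORCA's or TensorFlow's actual implementation; function names are made up for the example.

```python
import numpy as np

def quantize_per_tensor(w, num_bits=8):
    """Affine per-tensor quantization: a single scale and zero-point for the
    whole tensor (as opposed to per-axis, where each output channel would get
    its own scale). Illustrative sketch only."""
    qmin, qmax = -2 ** (num_bits - 1), 2 ** (num_bits - 1) - 1  # int8: [-128, 127]
    # The represented range must include 0.0 so that zero maps to an integer.
    w_min, w_max = min(float(w.min()), 0.0), max(float(w.max()), 0.0)
    scale = (w_max - w_min) / (qmax - qmin)
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 values back to float: w_hat = scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

# Example: quantize a small weight tensor and inspect the round-trip error,
# which is on the order of one quantization step (scale).
rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, size=(3, 3)).astype(np.float32)
q, s, z = quantize_per_tensor(w)
w_hat = dequantize(q, s, z)
print(q.dtype, s, z)
print(np.abs(w - w_hat).max())
```

Per-tensor quantization is simpler but can lose accuracy on layers whose channels have very different weight ranges, which is why some toolchains (e.g. HAILO's, or TensorFlow with per-axis enabled) quantize convolution weights per output channel instead.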