I get about 25 ms of latency between frames when displaying or writing, which results in choppy video writing and choppy imshow() display. How can I reduce the latency?
import cv2
import time

def high_frame_generator(video_source):
    stream = cv2.VideoCapture(video_source)
    stream.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*'MJPG'))
    stream.set(cv2.CAP_PROP_FRAME_WIDTH, 4000)
    stream.set(cv2.CAP_PROP_FRAME_HEIGHT, 3000)
    stream.set(cv2.CAP_PROP_FPS, 15)
    while True:
        ret, frame = stream.read()
        if not ret:
            break
        yield frame
        out.write(frame)  # 'out' is a cv2.VideoWriter defined elsewhere
    stream.release()
start_time = time.time()
for result in f_det_model.predict_batch(high_frame_generator(video_source)):
    print(result)
    # overlay = result.image_overlay() if callable(result.image_overlay) else result.image_overlay
    # cv2.putText(overlay, f'FPS: {fps:.2f} RES: {width}x{height}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    # cv2.imshow('Face Detection', overlay)
    print(f'time took to reach: {time.time() - start_time}')
    start_time = time.time()
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
import degirum as dg, degirum_tools

video_source = ...  # assign video source: can be a path to a file, local camera index, or RTSP URL

# load model here
# ...

with degirum_tools.Display("<Display Title>") as display, degirum_tools.open_video_stream(video_source) as video_stream:
    for result in model.predict_batch(degirum_tools.video_source(video_stream)):
        display.show(result)
It depends on the video source. If you manipulate the cv2.VideoCapture object by calling cap.set(cv2.CAP_PROP_FRAME_WIDTH, width), the result depends heavily on the source: it sometimes works for webcams (if the underlying driver supports it), but it does not work for video files, because OpenCV will not do the resizing for you.
You can implement resizing in your custom source by calling cv2.resize().
Please be advised that the model's pre-processor automatically resizes the input image to the model input size, so no action is needed on your side.
You may access that resized image passed to the model by setting model.save_model_image = True. The resized image will be in result._model_image.
Example:
import degirum as dg, degirum_tools, cv2

# load model here
# ...
model.save_model_image = True  # << here

with degirum_tools.Display("AI Camera") as display, degirum_tools.open_video_stream(video_source) as video_stream:
    for result in model.predict_batch(degirum_tools.video_source(video_stream)):
        # if the model was trained in RGB, convert back to BGR to get normal colors:
        display.show(cv2.cvtColor(result._model_image, cv2.COLOR_RGB2BGR))
The C++ SDK works with an AI server only; local and cloud inferences are not supported. It also lacks PySDK's pre- and post-processing functionality: you need to do the input resizing and the rendering of results on your side. All it does is run inferences on an AI server.