Channel 4 interview question

How would you benchmark an object detection model running over a video stream? For example, a model that is meant to detect alcohol consumption scenes in a video.

Interview answer

Anonymous

Sep 21, 2023

Looking at raw frame-by-frame predictions would not work very well, primarily because of class imbalance: most objects are absent from most frames. Visually, one can plot the predicted probability that an object is present over time against the ground-truth label. To quantify the overlap, one can use PR AUC, which is more robust to class imbalance than ROC AUC and does not require committing to a particular decision threshold.
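
To make the PR AUC idea concrete, here is a minimal sketch in plain Python. It assumes per-frame binary ground-truth labels and per-frame predicted probabilities, and it uses average precision (the step-wise area under the precision-recall curve) as the summary number. The function name `average_precision` and the 10-frame clip are made up for illustration; in practice one would more likely call a library routine such as scikit-learn's `average_precision_score`.

```python
from itertools import groupby


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive frame")

    # Rank frames from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    # Frames that share a score cross the threshold together, as one step.
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        for _, label in group:
            if label:
                tp += 1
            else:
                fp += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        # Weight the precision at this threshold by the recall gained.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical 10-frame clip: the object is visible only in frames 3 and 4.
labels = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
scores = [0.1, 0.2, 0.1, 0.9, 0.4, 0.6, 0.1, 0.05, 0.2, 0.1]
ap = average_precision(labels, scores)
```

No threshold is chosen anywhere: every distinct score acts as a threshold in turn, which is what makes the metric threshold-free. The false alarm at frame 5 (score 0.6) outranks the true frame 4 (score 0.4), and that is what pulls the score below 1.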