Query-based Hard-Image Retrieval for Object Detection at Test Time

Paper title

Query-based Hard-Image Retrieval for Object Detection at Test Time

Authors

Edward Ayers, Jonathan Sadeghi, John Redford, Romain Mueller, Puneet K. Dokania

Abstract

There is a longstanding interest in capturing the error behaviour of object detectors by finding images where their performance is likely to be unsatisfactory. In real-world applications such as autonomous driving, it is also crucial to characterise potential failures beyond simple requirements of detection performance. For example, a missed detection of a pedestrian close to an ego vehicle will generally require closer inspection than a missed detection of a car in the distance. The problem of predicting such potential failures at test time has largely been overlooked in the literature, and conventional approaches based on detection uncertainty fall short in that they are agnostic to such fine-grained characterisation of errors. In this work, we propose to reformulate the problem of finding "hard" images as a query-based hard image retrieval task, where queries are specific definitions of "hardness", and offer a simple and intuitive method that can solve this task for a large family of queries. Our method is entirely post-hoc, does not require ground-truth annotations, is independent of the choice of a detector, and relies on an efficient Monte Carlo estimation that uses a simple stochastic model in place of the ground-truth. We show experimentally that it can be applied successfully to a wide variety of queries for which it can reliably identify hard images for a given detector without any labelled data. We provide results on ranking and classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN, and Cascade Mask-RCNN object detectors. The code for this project is available at https://github.com/fiveai/hardest.
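The abstract does not spell out the stochastic model or the form of a query, so the following Python sketch is only an illustration of the general idea of a label-free Monte Carlo hardness estimate, not the authors' implementation. It assumes, purely for illustration, that each detection corresponds to a real object with probability equal to its confidence score. The `Detection` fields, the `close_pedestrian_errors` query, and the threshold and distance values are all hypothetical names and choices that do not come from the paper or its repository.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Detection:
    label: str
    score: float       # detector confidence in [0, 1]
    distance_m: float  # hypothetical distance from the ego vehicle, in metres


def sample_pseudo_ground_truth(dets: Sequence[Detection],
                               rng: random.Random) -> List[bool]:
    """Assumed stochastic model (not from the paper): a detection is a real
    object with probability equal to its confidence score."""
    return [rng.random() < d.score for d in dets]


def close_pedestrian_errors(dets: Sequence[Detection],
                            is_real: Sequence[bool],
                            threshold: float = 0.5,
                            max_distance_m: float = 10.0) -> float:
    """Hypothetical query: count errors on pedestrians near the ego vehicle.

    A detection is 'kept' if its score reaches the threshold. Keeping a
    detection that is not real, or dropping one that is real, is an error.
    """
    errors = 0
    for det, real in zip(dets, is_real):
        if det.label != "pedestrian" or det.distance_m > max_distance_m:
            continue  # this query ignores everything else
        kept = det.score >= threshold
        if kept != real:
            errors += 1
    return float(errors)


def estimate_hardness(dets: Sequence[Detection],
                      query: Callable[[Sequence[Detection], Sequence[bool]], float],
                      n_samples: int = 1000,
                      seed: int = 0) -> float:
    """Monte Carlo estimate of the expected query value for one image,
    averaging over sampled pseudo ground truths instead of real labels."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += query(dets, sample_pseudo_ground_truth(dets, rng))
    return total / n_samples
```

Under these assumptions, images could be ranked for a given query by sorting them on `estimate_hardness(dets, close_pedestrian_errors)` in descending order; swapping in a different query function changes the notion of "hardness" without touching the detector or requiring labels.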
