Mastering Large Scale Multi-label Image Recognition with High Efficiency over Camera Trap Images

Paper title

Mastering Large Scale Multi-label Image Recognition with High Efficiency over Camera Trap Images

Authors

Miroslav Valan, Lukáš Picek

Abstract

Camera traps are crucial in biodiversity-motivated studies; however, handling the large number of images while annotating these data sets is a tedious and time-consuming task. Machine learning approaches are a reasonable asset for speeding up this process. In this article we propose an easy, accessible, lightweight, fast and efficient approach based on our winning submission to the "Hakuna Ma-data - Serengeti Wildlife Identification challenge". Our system achieved an accuracy of 97% and outperformed human-level performance. We show that, given relatively large data sets, it is effective to look at each image only once with little or no augmentation. By using such a simple yet effective baseline, we were able to avoid overfitting without extensive regularization techniques and to train a top-scoring system on very limited hardware featuring a single GPU (1080Ti), despite the large training set (6.7M images and 6 TB).
