How many images do I need? Understanding how sample size per class affects deep learning model performance metrics

Paper title

How many images do I need? Understanding how sample size per class affects deep learning model performance metrics for balanced designs in autonomous wildlife monitoring

Authors

Saleh Shahinfar, Paul Meek, Greg Falzon

Abstract

Deep learning (DL) algorithms are the state of the art in automated classification of wildlife camera trap images. The challenge is that ecologists cannot know in advance how many images per species they need to collect for model training in order to achieve their desired classification accuracy. In fact, there is limited empirical evidence in the context of camera trapping to demonstrate that increasing sample size will lead to improved accuracy. In this study we explore in depth how deep learning model performance changes as per-class (species) sample sizes progressively increase. We also provide ecologists with an approximation formula to estimate, a priori, how many images per animal species they need for a given accuracy level. This will help ecologists allocate resources and effort optimally and design studies efficiently. To investigate the effect of the number of training images, seven training sets with 10, 20, 50, 150, 500, 1000 images per class were designed. Six deep learning architectures, namely ResNet-18, ResNet-50, ResNet-152, DenseNet-121, DenseNet-161, and DenseNet-201, were trained and tested on a common exclusive testing set of 250 images per class. The whole experiment was repeated on three similar datasets from Australia, Africa and North America, and the results were compared. Simple regression equations that practitioners can use to approximate model performance metrics are provided. Generalized additive models (GAM) are shown to be effective in modelling DL performance metrics based on the number of training images per class, tuning scheme and dataset.

Keywords: Camera Traps, Deep Learning, Ecological Informatics, Generalised Additive Models, Learning Curves, Predictive Modelling, Wildlife.
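To illustrate how a simple regression equation could be used to approximate accuracy from the number of training images per class, here is a minimal Python sketch. It is not the authors' method: the abstract does not state the form of their equations, so the log-linear form `score = a + b * ln(n)`, the function names `fit_log_linear` and `images_needed`, and the coefficients used to generate the example scores are all assumptions made for illustration. The scores are synthetic placeholders, not results from the paper; only the training-set sizes are taken from the abstract.

```python
import math


def fit_log_linear(sizes, scores):
    """Ordinary least-squares fit of score = a + b * ln(n).

    The log-linear form is an assumption for illustration only.
    """
    xs = [math.log(n) for n in sizes]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(scores) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def images_needed(a, b, target):
    """Invert the fitted curve: smallest integer n with a + b * ln(n) >= target."""
    return math.ceil(math.exp((target - a) / b))


# Training-set sizes per class as listed in the abstract.
SIZES = [10, 20, 50, 150, 500, 1000]

# SYNTHETIC placeholder scores generated from a made-up curve.
# These are NOT measurements or results from the paper.
MADE_UP_A, MADE_UP_B = 0.30, 0.08
scores = [MADE_UP_A + MADE_UP_B * math.log(n) for n in SIZES]

a, b = fit_log_linear(SIZES, scores)
n_req = images_needed(a, b, target=0.80)
print(f"fitted a={a:.3f}, b={b:.3f}; images per class for 0.80: {n_req}")
```

In practice, `scores` would be replaced by accuracies measured on a held-out test set at each training-set size. A generalized additive model, as used in the paper, can capture curvature and plateaus that this single-term fit cannot.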
