From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Paper title

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Authors

Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Andrew Ilyas, Aleksander Madry

Abstract

Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset, including the introduction of biases that state-of-the-art models exploit. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for. Finally, our findings emphasize the need to augment our current model training and evaluation toolkit to take such misalignments into account. To facilitate further research, we release our refined ImageNet annotations at https://github.com/MadryLab/ImageNetMultiLabel.

Paper title