Paper Title
What Do Neural Networks Learn When Trained With Random Labels?
Paper Authors
Paper Abstract
We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite its popularity in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs learn in this setting. In this paper, we show analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels. We study this alignment effect by investigating neural networks pre-trained on randomly labelled image data and subsequently fine-tuned on disjoint datasets with random or real labels. We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling. We analyze how competing effects, such as specialization at later layers, may hide the positive transfer. These effects are studied in several network architectures, including VGG16 and ResNet18, on CIFAR10 and ImageNet.
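A minimal sketch of the kind of alignment check the abstract describes: train a small fully connected network on entirely random labels and measure how well the principal components of the first-layer weights overlap with the principal components of the input data. This is not the paper's code; the synthetic Gaussian data, network size, optimizer settings, and the subspace-overlap metric are illustrative assumptions.

```python
# Illustrative sketch (assumptions, not the paper's setup): synthetic anisotropic
# data, a small MLP, and full-batch SGD on random labels. We then compare the
# top principal directions of the data with those of the first-layer weights.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic inputs with a decaying spectrum so the data has clear principal components.
n, d, num_classes = 2048, 64, 10
scales = torch.linspace(3.0, 0.1, d)
x = torch.randn(n, d) * scales                    # inputs
y = torch.randint(0, num_classes, (n,))           # entirely random labels

# Small fully connected network.
model = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, num_classes))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

for step in range(1000):                          # fit (memorize) the random labels
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# Top principal directions of the data (input covariance eigenvectors).
_, _, v_data = torch.pca_lowrank(x, q=10)         # shape: (d, 10)

# Top principal directions of the first-layer weight matrix (rows live in input space).
w = model[0].weight.detach()                      # shape: (128, d)
_, _, v_w = torch.pca_lowrank(w, q=10)            # shape: (d, 10)

# Singular values of the cross-projection measure the overlap of the two
# 10-dimensional subspaces; values near 1 indicate strong alignment.
overlap = torch.linalg.svdvals(v_data.T @ v_w)
print("subspace overlap (top-10 singular values):", overlap)
```

Under this sketch, the overlap after training on random labels can be contrasted with the overlap of a freshly initialized network to see how much alignment the training itself induces.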