Paper Title
Usable Information and Evolution of Optimal Representations During Training
Paper Authors
Paper Abstract
We introduce a notion of usable information contained in the representation learned by a deep network, and use it to study how optimal representations for the task emerge during training. We show that the implicit regularization coming from training with Stochastic Gradient Descent with a high learning rate and a small batch size plays an important role in learning minimal sufficient representations for the task. In the process of arriving at a minimal sufficient representation, we find that the content of the representation changes dynamically during training. In particular, we find that semantically meaningful but ultimately irrelevant information is encoded in the early transient dynamics of training, before later being discarded. In addition, we evaluate how perturbing the initial part of training impacts the learning dynamics and the resulting representations. We show these effects both on perceptual decision-making tasks inspired by the neuroscience literature and on standard image classification tasks.
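For concreteness, here is a minimal sketch of one way a "usable information" quantity of this kind can be estimated in practice: train a decoder from a restricted family (here a linear probe) on the learned representations, and subtract its held-out cross-entropy from the entropy of the label marginal. The function name, the choice of scikit-learn's LogisticRegression as the decoder family, and the train/test split are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: estimate the usable information (in nats) that a decoder family V
# can extract about labels Y from representations Z, as
#   I_V(Z -> Y) = H(Y) - H_V(Y | Z),
# where H_V(Y | Z) is the held-out cross-entropy of the best decoder in V.
# Illustrative assumptions: V = linear probes, Y = non-negative integer labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def usable_information(Z, Y):
    """Estimate usable information about labels Y in representations Z."""
    Z_tr, Z_te, Y_tr, Y_te = train_test_split(Z, Y, test_size=0.3,
                                              random_state=0)

    # H_V(Y | Z): cross-entropy of a trained linear probe on held-out data.
    probe = LogisticRegression(max_iter=1000).fit(Z_tr, Y_tr)
    h_y_given_z = log_loss(Y_te, probe.predict_proba(Z_te),
                           labels=probe.classes_)

    # H(Y): entropy of the empirical label marginal on the same split.
    counts = np.bincount(Y_te)
    p = counts[counts > 0] / counts.sum()
    h_y = -(p * np.log(p)).sum()

    return h_y - h_y_given_z
```

Evaluating such a probe on a network's intermediate activations at successive training checkpoints is one way to track how task-relevant (and ultimately irrelevant) information in the representation rises and falls over training.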