Paper Title
Generalization Comparison of Deep Neural Networks via Output Sensitivity
Paper Authors
Paper Abstract
Although recent works have brought some insight into the performance improvements achieved by techniques used in state-of-the-art deep-learning models, more work is needed to understand their generalization properties. We shed light on this matter by linking the loss function to the sensitivity of the network's output to its input. We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks without requiring labeled data. We find that sensitivity is decreased by applying popular methods that improve the generalization performance of the model, such as (1) using a deep network rather than a wide one, (2) adding convolutional layers to baseline classifiers instead of adding fully-connected layers, (3) using batch normalization, dropout, and max-pooling, and (4) applying parameter initialization techniques.
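For context, the variance term the abstract refers to is the one in the standard bias-variance decomposition of the expected squared loss (a textbook identity, not a formula quoted from this paper). With training set $D$, learned predictor $f_D$, target $y = f^\ast(x) + \varepsilon$, and noise variance $\sigma^2$:

$$
\mathbb{E}_{D,\varepsilon}\big[(y - f_D(x))^2\big]
= \underbrace{\big(\mathbb{E}_D[f_D(x)] - f^\ast(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\big[\big(f_D(x) - \mathbb{E}_D[f_D(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{noise}}
$$

The abstract does not spell out how output sensitivity is computed; the sketch below is a minimal illustration, not the paper's estimator. It assumes a PyTorch model and an unlabeled data loader, and approximates sensitivity by a finite-difference ratio of output change to input change under small Gaussian perturbations; the function name `estimate_output_sensitivity` and the perturbation scale `eps` are hypothetical. Note that no labels are used, matching the claim that the metric requires no labeled data.

```python
import torch


def estimate_output_sensitivity(model, data_loader, eps=1e-3, device="cpu"):
    """Approximate the mean output sensitivity
    E[ ||f(x + d) - f(x)|| / ||d|| ] over unlabeled inputs x,
    using small Gaussian perturbations d (finite-difference estimate)."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for batch in data_loader:
            # Labels, if the loader yields (inputs, labels, ...), are ignored:
            # the metric needs only the inputs.
            x = (batch[0] if isinstance(batch, (list, tuple)) else batch).to(device)
            d = eps * torch.randn_like(x)          # small random input perturbation
            out_diff = model(x + d) - model(x)
            num = out_diff.flatten(1).norm(dim=1)  # per-sample output change
            den = d.flatten(1).norm(dim=1)         # per-sample input change
            total += (num / den).sum().item()
            count += x.size(0)
    return total / count
```

Under this reading, comparing two trained networks reduces to evaluating this estimate on the same unlabeled set: the abstract's empirical finding suggests the network with the lower sensitivity tends to have lower variance and hence better generalization.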