Paper Title

Investigating Generalization by Controlling Normalized Margin

Authors

Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue

Abstract


Weight norm $\|w\|$ and margin $\gamma$ participate in learning theory via the normalized margin $\gamma/\|w\|$. Since standard neural net optimizers do not control normalized margin, it is hard to test whether this quantity causally relates to generalization. This paper designs a series of experimental studies that explicitly control normalized margin and thereby tackle two central questions. First: does normalized margin always have a causal effect on generalization? The paper finds that no -- networks can be produced where normalized margin has seemingly no relationship with generalization, counter to the theory of Bartlett et al. (2017). Second: does normalized margin ever have a causal effect on generalization? The paper finds that yes -- in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior.
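To make the central quantity concrete, here is a minimal sketch of the normalized margin $\gamma/\|w\|$ for a linear classifier. This is an illustration of the definition in the abstract, not the paper's experimental code; the function name and data are hypothetical.

```python
import numpy as np

def normalized_margin(w, X, y):
    """Normalized margin gamma / ||w|| for a linear classifier
    f(x) = <w, x> with labels y in {-1, +1}.
    Illustrative sketch of the quantity from the abstract."""
    scores = X @ w                 # f(x_i) for each sample
    gamma = np.min(y * scores)     # margin: smallest signed score
    return gamma / np.linalg.norm(w)

# Rescaling w rescales gamma and ||w|| identically, so the
# normalized margin is invariant -- which is why learning theory
# works with gamma / ||w|| rather than the raw margin.
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, -1.0])
w = np.array([2.0, 2.0])
m1 = normalized_margin(w, X, y)
m2 = normalized_margin(10 * w, X, y)  # same value as m1
```

The scale invariance shown at the bottom is the reason an optimizer can change the raw margin freely without affecting the generalization-relevant quantity, which motivates the paper's interventions that control the normalized margin directly.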
