Paper Title

Investigating Generalization by Controlling Normalized Margin

Authors

Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue

Abstract


Weight norm $\|w\|$ and margin $\gamma$ participate in learning theory via the normalized margin $\gamma/\|w\|$. Since standard neural net optimizers do not control normalized margin, it is hard to test whether this quantity causally relates to generalization. This paper designs a series of experimental studies that explicitly control normalized margin and thereby tackle two central questions. First: does normalized margin always have a causal effect on generalization? The paper finds that no -- networks can be produced where normalized margin has seemingly no relationship with generalization, counter to the theory of Bartlett et al. (2017). Second: does normalized margin ever have a causal effect on generalization? The paper finds that yes -- in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior.
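To make the central quantity concrete, here is a minimal sketch of the normalized margin $\gamma/\|w\|$ for a linear classifier. This is an illustration of the definition in the abstract, not the paper's experimental code; the function name and data are hypothetical.

```python
import numpy as np

def normalized_margin(w, X, y):
    """Normalized margin gamma / ||w|| for a linear classifier
    f(x) = <w, x> with labels y in {-1, +1}.
    Illustrative sketch of the quantity from the abstract."""
    scores = X @ w                 # f(x_i) for each sample
    gamma = np.min(y * scores)     # margin: smallest signed score
    return gamma / np.linalg.norm(w)

# Rescaling w rescales gamma and ||w|| identically, so the
# normalized margin is invariant -- which is why learning theory
# works with gamma / ||w|| rather than the raw margin.
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, -1.0])
w = np.array([2.0, 2.0])
m1 = normalized_margin(w, X, y)
m2 = normalized_margin(10 * w, X, y)  # same value as m1
```

The scale invariance shown at the bottom is the reason an optimizer can change the raw margin freely without affecting the generalization-relevant quantity, which motivates the paper's interventions that control the normalized margin directly.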
