Paper Title

Neural Collapse: A Review on Modelling Principles and Generalization

Paper Authors

Kothapalli, Vignesh

Abstract

Deep classifier neural networks enter the terminal phase of training (TPT) when training error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural collapse essentially represents a state at which the within-class variability of final hidden layer outputs is infinitesimally small and their class means form a simplex equiangular tight frame. This simplifies the last layer behaviour to that of a nearest-class center decision rule. Despite the simplicity of this state, the dynamics and implications of reaching it are yet to be fully understood. In this work, we review the principles which aid in modelling neural collapse, followed by the implications of this state on generalization and transfer learning capabilities of neural networks. Finally, we conclude by discussing potential avenues and directions for future research.
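
As a concrete illustration of the collapse properties described in the abstract, the sketch below checks two of them on a set of penultimate-layer features: the within-class variability shrinking relative to the between-class spread, and the centered class means approaching a simplex equiangular tight frame, whose pairwise cosines equal −1/(C−1) for C classes. This is a minimal diagnostic under stated assumptions, not the survey's exact metrics; the function name and the simplified variance ratio are illustrative choices.

```python
import numpy as np

def neural_collapse_metrics(H, y):
    """Illustrative NC diagnostics (names and ratio are assumptions, not the paper's).

    H : (n, d) array of final hidden-layer (penultimate) features.
    y : (n,) integer class labels in {0, ..., C-1}, with C >= 2.
    Returns (nc1, nc2_dev); both tend to 0 as collapse sets in.
    """
    classes = np.unique(y)
    C = len(classes)
    mu_g = H.mean(axis=0)                                      # global feature mean
    mu = np.stack([H[y == c].mean(axis=0) for c in classes])   # per-class means

    # NC1 proxy: average within-class squared deviation over
    # average between-class squared deviation from the global mean.
    within = np.mean([((H[y == c] - mu[i]) ** 2).sum() / (y == c).sum()
                      for i, c in enumerate(classes)])
    between = ((mu - mu_g) ** 2).sum() / C
    nc1 = within / between

    # NC2: centered, normalized class means should form a simplex ETF,
    # i.e. all off-diagonal pairwise cosines approach -1/(C-1).
    M = mu - mu_g
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    cos = M @ M.T
    off_diag = cos[~np.eye(C, dtype=bool)]
    nc2_dev = np.abs(off_diag + 1.0 / (C - 1)).max()
    return nc1, nc2_dev

# Usage on random (uncollapsed) features: both values stay far from 0.
rng = np.random.default_rng(0)
H = rng.normal(size=(300, 64))
y = rng.integers(0, 3, size=300)
print(neural_collapse_metrics(H, y))
```

Run on features extracted during the terminal phase of training, both quantities should drive toward zero, which is also what makes the last layer behave like a nearest-class-center classifier.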
