Paper Title
Double Descent Risk and Volume Saturation Effects: A Geometric Perspective
Paper Authors
Paper Abstract
The double-descent risk phenomenon has attracted growing interest in the machine learning and statistics communities, as it challenges the well-understood notions behind U-shaped train-test curves. Motivated by Rissanen's minimum description length (MDL), Balasubramanian's Occam's razor, and Amari's information geometry, we investigate how the logarithm of the model volume, $\log V$, extends the intuition behind the AIC and BIC model selection criteria. We find that for the particular model classes of isotropic linear regression and statistical lattices, the $\log V$ term may be decomposed into a sum of distinct components, each of which helps explain the appearance of this phenomenon. In particular, these components suggest why generalization error does not necessarily continue to grow with increasing model dimensionality.