Paper Title
Avoiding The Double Descent Phenomenon of Random Feature Models Using Hybrid Regularization
Paper Authors
Paper Abstract
We demonstrate the ability of hybrid regularization methods to automatically avoid the double descent phenomenon arising in the training of random feature models (RFMs). The hallmark feature of the double descent phenomenon is a spike in the generalization gap at the interpolation threshold, i.e., when the number of features in the RFM equals the number of training samples. To close this gap, the hybrid method considered in our paper combines the respective strengths of the two most common forms of regularization: early stopping and weight decay. The scheme does not require hyperparameter tuning, as it automatically selects the stopping iteration and the weight decay hyperparameter using generalized cross-validation (GCV). This also avoids the need for a dedicated validation set. While the benefits of hybrid methods have been well documented for ill-posed inverse problems, our work presents their first use case in machine learning. To expose the need for regularization and motivate hybrid methods, we perform detailed numerical experiments inspired by image classification. In those examples, the hybrid scheme successfully avoids the double descent phenomenon and yields RFMs whose generalization is comparable with that of classical regularization approaches whose hyperparameters are tuned optimally using the test data. We provide MATLAB code for implementing the numerical experiments in this paper at https://github.com/EmoryMLIP/HybridRFM.
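To give a concrete sense of the GCV-based selection mentioned in the abstract, the following minimal MATLAB sketch fits a toy random feature model and picks the weight-decay parameter by minimizing the GCV function, with no validation set. This is an illustration under assumed toy data and hypothetical variable names (Z, lamOpt, etc.), not the authors' implementation; their full hybrid solver, including the automatically stopped iterative component, is in the linked repository.

```matlab
% Minimal sketch: GCV-selected weight decay for a random feature model.
% Illustrative only; see https://github.com/EmoryMLIP/HybridRFM for the
% authors' full hybrid scheme (early stopping + weight decay via GCV).

rng(0);
m = 200;   % number of training samples
d = 50;    % input dimension
n = 300;   % number of random features (n > m: past the interpolation threshold)

X = randn(m, d);                      % toy training inputs
y = sin(X(:,1)) + 0.1*randn(m, 1);    % noisy scalar targets

% Random feature map: fixed random weights and biases, cosine activation
W = randn(d, n);  b = 2*pi*rand(1, n);
Z = cos(X*W + b);                     % m-by-n feature matrix

% Closed-form GCV via the SVD of Z: with weight decay lambda, the filter
% factors are s_i^2/(s_i^2 + lambda^2), and
%   GCV(lambda) = ||residual||^2 / (m - sum of filter factors)^2.
[U, S, ~] = svd(Z, 'econ');
s = diag(S);
c = U'*y;                             % target in the left singular basis
res0 = norm(y)^2 - norm(c)^2;         % residual component outside range(Z)

gcv = @(lam) (sum(((lam^2./(s.^2 + lam^2)).*c).^2) + res0) ...
             / (m - sum(s.^2./(s.^2 + lam^2)))^2;

% Minimize GCV on a logarithmic grid -- no manual hyperparameter tuning
lams = logspace(-6, 2, 100);
vals = arrayfun(gcv, lams);
[~, idx] = min(vals);
lamOpt = lams(idx);
fprintf('GCV-selected weight decay: lambda = %.2e\n', lamOpt);

% Regularized RFM weights for the selected lambda
wOpt = (Z'*Z + lamOpt^2*eye(n)) \ (Z'*y);
```

Because GCV estimates predictive risk directly from the training residual and the effective degrees of freedom, the selected lambda stays bounded away from zero near the interpolation threshold, which is what suppresses the double descent spike in the experiments the abstract describes.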