Paper Title
What needles do sparse neural networks find in nonlinear haystacks
Paper Authors
Paper Abstract
Using a sparsity-inducing penalty in artificial neural networks (ANNs) avoids over-fitting, especially in situations where the noise is high and the training set is small in comparison to the number of features. For linear models, such an approach provably also recovers the important features with high probability, in certain regimes, for a well-chosen penalty parameter. The typical way of setting the penalty parameter is to split the data set and perform cross-validation, which is (1) computationally expensive and (2) not desirable when the data set is already too small to be split further (for example, whole-genome sequence data). In this study, we establish the theoretical foundation to select the penalty parameter without cross-validation, based on bounding with high probability the infinity norm of the gradient of the loss function at zero under the zero-feature assumption. Our approach is a generalization of the universal threshold of Donoho and Johnstone (1994) to nonlinear ANN learning. We perform a comprehensive set of Monte Carlo simulations on a simple model, and the numerical results show the effectiveness of the proposed approach.
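To make the selection rule concrete, here is a minimal Monte Carlo sketch of the idea for the simplest case the abstract alludes to: a linear model with squared-error loss. The function name universal_lambda, the Gaussian design, and the noise level sigma are illustrative assumptions, not the paper's code; the paper's contribution is extending this recipe to the gradient of a nonlinear ANN loss at zero weights. Under the zero-feature (pure-noise) null, the sketch tabulates the infinity norm of the loss gradient at zero and takes a high quantile as the penalty parameter, avoiding cross-validation.

```python
import numpy as np

def universal_lambda(X, sigma=1.0, n_mc=1000, alpha=0.05, seed=0):
    """Monte Carlo estimate of a penalty lambda that bounds, with
    probability about 1 - alpha, the infinity norm of the gradient of
    the squared-error loss at zero under the zero-feature null model.

    For L(beta) = ||y - X @ beta||^2 / (2 n), the gradient at beta = 0
    is -X.T @ y / n, and under the null the response y is pure noise
    sigma * eps with eps ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sup_norms = np.empty(n_mc)
    for b in range(n_mc):
        eps = rng.standard_normal(n)            # pure-noise response
        grad0 = X.T @ (sigma * eps) / n         # gradient of the loss at zero
        sup_norms[b] = np.abs(grad0).max()      # infinity norm of the gradient
    return np.quantile(sup_norms, 1.0 - alpha)  # high-probability bound

# Toy usage: a Gaussian design with (approximately) norm-sqrt(n) columns,
# where the estimate should land near the classical universal threshold
# sigma * sqrt(2 * log(p) / n) of Donoho and Johnstone (1994).
n, p, sigma = 200, 50, 1.0
X = np.random.default_rng(1).standard_normal((n, p))
lam = universal_lambda(X, sigma=sigma)
print(f"Monte Carlo lambda : {lam:.4f}")
print(f"sqrt(2 log p / n)  : {sigma * np.sqrt(2 * np.log(p) / n):.4f}")
```

In this linear special case the quantile agrees with the closed-form universal threshold; for an ANN there is no closed form, which is why the same null-model bound on the gradient's sup-norm is computed (or bounded theoretically, as in the paper) instead.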