Title
Global Convergence of Sobolev Training for Overparameterized Neural Networks
Authors
Abstract
The Sobolev loss is used when training a network to approximate both the values and the derivatives of a target function at a prescribed set of input points. Recent works have demonstrated its successful application in tasks such as distillation and synthetic gradient prediction. In this work we prove that an overparameterized two-layer ReLU neural network trained on the Sobolev loss by gradient flow from random initialization can fit any given function values and any given directional derivatives, under a separation condition on the input data.
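The setting above can be sketched concretely. The following is a minimal illustration, not the paper's construction: a two-layer ReLU network f(x) = (1/√m) Σ_r a_r · relu(w_r · x) with fixed second-layer signs, and a Sobolev loss that penalizes both value errors and directional-derivative errors at prescribed points. All sizes (n, d, m) and the random data here are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n input points in d dimensions, network width m.
n, d, m = 8, 3, 256

W = rng.normal(size=(m, d))          # first-layer weights w_r (trained)
a = rng.choice([-1.0, 1.0], size=m)  # fixed second-layer signs a_r

def f(x):
    """Two-layer ReLU network: f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x)."""
    return (a @ np.maximum(W @ x, 0.0)) / np.sqrt(m)

def grad_f(x):
    """Input gradient: (1/sqrt(m)) * sum_r a_r * 1[w_r . x > 0] * w_r."""
    active = (W @ x > 0.0).astype(float)
    return ((a * active) @ W) / np.sqrt(m)

# Prescribed data: inputs x_i, target values y_i, unit directions v_i,
# and target directional derivatives g_i (all random here for illustration).
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
V = rng.normal(size=(n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)
g = rng.normal(size=n)

def sobolev_loss():
    """Sum of squared value errors plus squared directional-derivative errors."""
    value_term = sum((f(x) - yi) ** 2 for x, yi in zip(X, y))
    deriv_term = sum((v @ grad_f(x) - gi) ** 2 for x, v, gi in zip(X, V, g))
    return 0.5 * (value_term + deriv_term)

print(sobolev_loss())
```

The paper's result concerns gradient flow on W for this kind of loss; a practical experiment would instead run (sub)gradient descent on `sobolev_loss`, which is piecewise smooth because of the ReLU activation pattern.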