Paper Title

Multilevel Minimization for Deep Residual Networks

Authors

Gaedke-Merzhäuser, Lisa, Kopaničáková, Alena, Krause, Rolf

Abstract

We present a new multilevel minimization framework for the training of deep residual networks (ResNets), which has the potential to significantly reduce training time and effort. Our framework is based on the dynamical systems viewpoint, which formulates a ResNet as the discretization of an initial value problem. The training process is then formulated as a time-dependent optimal control problem, which we discretize using different time-discretization parameters, eventually generating a multilevel hierarchy of auxiliary networks with different resolutions. The training of the original ResNet is then enhanced by training the auxiliary networks with reduced resolutions. By design, our framework is conveniently independent of the training strategy chosen on each level of the multilevel hierarchy. By means of numerical examples, we analyze the convergence behavior of the proposed method and demonstrate its robustness. For our examples we employ a multilevel gradient-based method. Comparisons with standard single-level methods show a speedup of more than a factor of three while achieving the same validation accuracy.
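The dynamical-systems viewpoint mentioned in the abstract can be sketched in a few lines: a forward-Euler discretization of an ODE dx/dt = f(x, θ) gives residual updates x ← x + h·f(x, θ), and coarser time discretizations yield cheaper auxiliary networks. The sketch below is illustrative only, not the authors' implementation; the `tanh` residual map, the `coarsen` rule (keep every other block, double the step size), and all names are assumptions for demonstration.

```python
import numpy as np

def resnet_forward(x, weights, h):
    """Forward-Euler discretization of dx/dt = f(x, theta):
    each residual block computes x <- x + h * tanh(W @ x)."""
    for W in weights:
        x = x + h * np.tanh(W @ x)
    return x

def coarsen(weights):
    """Illustrative coarse-level network: keep every other block,
    which halves the time resolution (the step size doubles)."""
    return weights[::2]

rng = np.random.default_rng(0)
d, n_blocks = 4, 8
fine_weights = [0.1 * rng.standard_normal((d, d)) for _ in range(n_blocks)]
x0 = rng.standard_normal(d)

T = 1.0  # final "time" of the underlying initial value problem
y_fine = resnet_forward(x0, fine_weights, T / n_blocks)

coarse_weights = coarsen(fine_weights)
y_coarse = resnet_forward(x0, coarse_weights, T / len(coarse_weights))

# Both networks discretize the same ODE flow; the coarse one is
# cheaper to train, and its progress can inform the fine level.
print(np.linalg.norm(y_fine - y_coarse))
```

In a multilevel scheme of this kind, optimization steps on the coarse network are cheap because it has half as many blocks, and their result is transferred back to accelerate training of the fine network.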
