Paper Title
A Shooting Formulation of Deep Learning
Paper Authors
Paper Abstract
Continuous-depth neural networks can be viewed as deep limits of discrete neural networks whose dynamics resemble a discretization of an ordinary differential equation (ODE). Although important steps have been taken to realize the advantages of such continuous formulations, most current techniques are not truly continuous-depth as they assume \textit{identical} layers. Indeed, existing works throw into relief the myriad difficulties presented by an infinite-dimensional parameter space in learning a continuous-depth neural ODE. To this end, we introduce a shooting formulation which shifts the perspective from parameterizing a network layer-by-layer to parameterizing over optimal networks described only by a set of initial conditions. For scalability, we propose a novel particle-ensemble parameterization which fully specifies the optimal weight trajectory of the continuous-depth neural network. Our experiments show that our particle-ensemble shooting formulation can achieve competitive performance, especially on long-range forecasting tasks. Finally, though the current work is inspired by continuous-depth neural networks, the particle-ensemble shooting formulation also applies to discrete-time networks and may lead to a fertile new area of research in deep learning parameterization.
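The core shift described in the abstract can be illustrated with a minimal sketch: rather than learning a separate weight for every layer (or time step), only the initial conditions of a weight ODE are learned, and the entire weight trajectory is obtained by integrating that ODE alongside the network state. The sketch below is hypothetical and assumes simple Hamiltonian-like weight dynamics and a scalar `w * tanh(x)` state update for illustration; it is not the authors' implementation, which derives the weight dynamics from optimality conditions.

```python
import numpy as np

def shooting_forward(x0, w0, p0, T=1.0, steps=10):
    """Sketch of a shooting parameterization.

    Only the initial weight w0 and its momentum p0 are learnable;
    the full weight trajectory w(t) follows by integrating an ODE,
    so no per-layer weights are stored.
    """
    dt = T / steps
    x, w, p = x0.astype(float).copy(), w0.astype(float).copy(), p0.astype(float).copy()
    for _ in range(steps):
        # Hypothetical weight dynamics (illustrative only): a
        # Hamiltonian-like system driving the weight trajectory.
        w_dot = p
        p_dot = -w
        # Network state evolves under the *current* weights.
        x_dot = w * np.tanh(x)
        # Forward-Euler step for state, weights, and momenta together.
        x += dt * x_dot
        w += dt * w_dot
        p += dt * p_dot
    return x
```

In a training loop, gradients would flow only into `w0` and `p0` (e.g., via an adjoint method or automatic differentiation through the integrator), which is what makes the parameter space finite-dimensional even though the depth is continuous.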