Paper Title
Optimizing Neural Networks via Koopman Operator Theory
Paper Authors
Paper Abstract
Koopman operator theory, a powerful framework for discovering the underlying dynamics of nonlinear dynamical systems, was recently shown to be intimately connected with neural network training. In this work, we take the first steps toward making use of this connection. As Koopman operator theory is a linear theory, a successful implementation of it in evolving network weights and biases offers the promise of accelerated training, especially in the context of deep networks, where optimization is inherently a non-convex problem. We show that Koopman operator theoretic methods allow for accurate predictions of the weights and biases of feedforward, fully connected deep networks over a non-trivial range of training time. During this window, we find that our approach is >10x faster than various gradient-descent-based methods (e.g., Adam, Adadelta, Adagrad), in line with our complexity analysis. We conclude by highlighting open questions in this exciting intersection between dynamical systems and neural network theory, and we point to additional ways in which our results could be extended to broader classes of networks and larger training intervals, which will be the focus of future work.
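To make the core idea concrete, the sketch below shows one common way such a prediction could be set up; it is a minimal illustration, not the authors' implementation. Weight snapshots logged during a few ordinary optimizer steps are stacked into a matrix, a linear (Koopman/DMD-style) operator is fit to them by least squares, and future weights are obtained by repeatedly applying that operator instead of computing further gradients. The function names and the toy trajectory are illustrative assumptions; appending a constant observable is one simple "lifting" that lets a linear fit capture affine weight dynamics.

```python
# Minimal, illustrative sketch (assumed setup, not the paper's code): fit a
# linear operator to a short history of lifted weight snapshots and use it to
# extrapolate the weight trajectory forward in training time.
import numpy as np


def fit_linear_operator(snapshots):
    """snapshots: (d, m) array whose columns are successive lifted weight vectors.
    Returns A solving the least-squares problem snapshots[:, 1:] ~= A @ snapshots[:, :-1]."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    return Y @ np.linalg.pinv(X)


def predict(A, z0, n_steps):
    """Roll the fitted operator forward n_steps from the lifted state z0."""
    z = z0.copy()
    for _ in range(n_steps):
        z = A @ z
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m = 50, 20                          # toy weight dimension, number of snapshots

    # Synthetic "training trajectory": weights relaxing toward a fixed point,
    # standing in for snapshots logged from a real optimizer run.
    target = rng.normal(size=d)
    w = rng.normal(size=d)
    snaps = []
    for _ in range(m):
        snaps.append(np.append(w, 1.0))    # lift: append a constant observable
        w = w + 0.1 * (target - w)         # stand-in for one optimizer update
    snaps = np.stack(snaps, axis=1)        # shape (d + 1, m)

    A = fit_linear_operator(snaps)
    z_pred = predict(A, snaps[:, -1], n_steps=50)
    print("distance to fixed point after 50 predicted steps:",
          np.linalg.norm(z_pred[:-1] - target))
```

Applying the fitted operator is a single matrix-vector product per predicted step, which is the source of the speedup claimed over per-step gradient computation within the prediction window.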