Paper Title
Continuous Deep Equilibrium Models: Training Neural ODEs faster by integrating them to Infinity
Paper Authors
Paper Abstract
Implicit models separate the definition of a layer from the description of its solution process. While implicit layers allow features such as depth to adapt to new scenarios and inputs automatically, this adaptivity makes their computational expense challenging to predict. In this manuscript, we increase the "implicitness" of the DEQ by redefining the method in terms of an infinite-time neural ODE, which paradoxically decreases the training cost over a standard neural ODE by 2-4x. Additionally, we address the question: is there a way to simultaneously achieve the robustness of implicit layers while allowing the reduced computational expense of an explicit layer? To solve this, we develop Skip and Skip Reg. DEQ, an implicit-explicit (IMEX) layer that simultaneously trains an explicit prediction followed by an implicit correction. We show that training this explicit predictor is free and even decreases the training time by 1.11-3.19x. Together, this manuscript shows how bridging the dichotomy of implicit and explicit deep learning can combine the advantages of both techniques.
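The abstract's core idea — an explicit prediction followed by an implicit correction — can be illustrated with a toy fixed-point layer. The sketch below is not the paper's implementation: `f` stands in for a trained DEQ network (here a hand-picked contraction so the fixed point exists), and `explicit_predictor` stands in for the trained explicit branch; both names are hypothetical. It shows only the mechanism: warm-starting the implicit solve from an explicit guess reaches the same fixed point in fewer iterations.

```python
import numpy as np

def f(z, x):
    # Toy implicit-layer dynamics defining the fixed point z* = f(z*, x).
    # The 0.5 factor makes f a contraction, so iteration converges;
    # in a real DEQ this is a neural network.
    return 0.5 * np.tanh(z + x)

def explicit_predictor(x):
    # Hypothetical stand-in for Skip DEQ's trained explicit branch:
    # a cheap one-shot guess of the fixed point (one application of f from zero).
    return f(np.zeros_like(x), x)

def deq_forward(x, z0=None, tol=1e-8, max_iter=100):
    # Implicit correction: iterate z <- f(z, x) until the fixed point z*.
    z = np.zeros_like(x) if z0 is None else z0
    for i in range(max_iter):
        z_next = f(z, x)
        if np.max(np.abs(z_next - z)) < tol:
            return z_next, i + 1
        z = z_next
    return z, max_iter

x = np.array([0.3, -0.7, 1.2])
z_cold, n_cold = deq_forward(x)                            # plain DEQ: cold start
z_warm, n_warm = deq_forward(x, z0=explicit_predictor(x))  # Skip-style warm start
```

Both solves converge to the same fixed point, but the warm-started one needs no more iterations than the cold start — the mechanism behind the reported speedups.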