Paper title
Monotone operator equilibrium networks
Paper authors
Paper abstract
Implicit-depth models such as Deep Equilibrium Networks have recently been shown to match or exceed the performance of traditional deep networks while being much more memory efficient. However, these models suffer from unstable convergence to a solution and lack guarantees that a solution exists. On the other hand, Neural ODEs, another class of implicit-depth models, do guarantee existence of a unique solution but perform poorly compared with traditional networks. In this paper, we develop a new class of implicit-depth model based on the theory of monotone operators, the Monotone Operator Equilibrium Network (monDEQ). We show the close connection between finding the equilibrium point of an implicit network and solving a form of monotone operator splitting problem, which admits efficient solvers with guaranteed, stable convergence. We then develop a parameterization of the network which ensures that all operators remain monotone, which guarantees the existence of a unique equilibrium point. Finally, we show how to instantiate several versions of these models, and implement the resulting iterative solvers, for structured linear operators such as multi-scale convolutions. The resulting models vastly outperform the Neural ODE-based models while also being more computationally efficient. Code is available at http://github.com/locuslab/monotone_op_net.
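The two ideas named in the abstract, a parameterization that keeps the operator monotone and an operator-splitting solver for the equilibrium point, can be illustrated with a small dense example. The sketch below rests on assumptions that the abstract itself does not state: the equilibrium is taken to be z* = ReLU(W z* + U x + b), the weight matrix is parameterized as W = (1 - m) I - AᵀA + B - Bᵀ so that I - W is strongly monotone with parameter m, and the solver is a plain forward-backward iteration with a conservative step size. The function names `make_monotone_W` and `forward_backward_solve` are hypothetical and are not taken from the linked repository.

```python
import numpy as np


def make_monotone_W(A, B, m=0.5):
    """Build W = (1 - m) I - A^T A + B - B^T (assumed parameterization).

    The symmetric part of I - W is then m I + A^T A, whose eigenvalues
    are all >= m, so I - W is strongly monotone for any A and B.
    """
    n = A.shape[1]
    return (1.0 - m) * np.eye(n) - A.T @ A + B - B.T


def forward_backward_solve(W, U, b, x, m, tol=1e-10, max_iter=100_000):
    """Forward-backward iteration for z = relu(W z + U x + b).

    Each step is z <- relu((1 - alpha) z + alpha (W z + U x + b)),
    i.e. a gradient-like step on the linear operator followed by the
    proximal step of the nonnegativity constraint (which is ReLU).
    """
    n = W.shape[0]
    # Lipschitz constant of z -> (I - W) z is the spectral norm of I - W.
    L = np.linalg.norm(np.eye(n) - W, 2)
    # Any alpha in (0, 2m / L^2) gives a contraction; m / L^2 is a safe choice.
    alpha = m / L**2
    bias = U @ x + b
    z = np.zeros(n)
    for k in range(max_iter):
        z_next = np.maximum(0.0, (1.0 - alpha) * z + alpha * (W @ z + bias))
        if np.linalg.norm(z_next - z) <= tol * (1.0 + np.linalg.norm(z)):
            return z_next, k + 1
        z = z_next
    return z, max_iter


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, m = 6, 3, 0.5
    A = 0.3 * rng.standard_normal((n, n))
    B = 0.3 * rng.standard_normal((n, n))
    U = rng.standard_normal((n, d))
    b = rng.standard_normal(n)
    x = rng.standard_normal(d)

    W = make_monotone_W(A, B, m)
    z_star, iters = forward_backward_solve(W, U, b, x, m)
    residual = np.linalg.norm(z_star - np.maximum(0.0, W @ z_star + U @ x + b))
    print("iterations:", iters, "fixed-point residual:", residual)
```

Because monotonicity holds by construction for every choice of A and B, these matrices can be trained without constraints, and the iteration converges from any starting point. For the structured operators the abstract mentions, such as multi-scale convolutions, the dense matrix products here would be replaced by the corresponding linear maps.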