Paper Title
Understanding and Mitigating Exploding Inverses in Invertible Neural Networks
Paper Authors
Paper Abstract
Invertible neural networks (INNs) have been used to design generative models, implement memory-saving gradient computation, and solve inverse problems. In this work, we show that commonly used INN architectures suffer from exploding inverses and are thus prone to becoming numerically non-invertible. Across a wide range of INN use cases, we reveal failures including the non-applicability of the change-of-variables formula on in- and out-of-distribution (OOD) data, incorrect gradients for memory-saving backprop, and the inability to sample from normalizing flow models. We further derive bi-Lipschitz properties of atomic building blocks of common architectures. These insights into the stability of INNs then provide ways forward to remedy these failures. For tasks where local invertibility is sufficient, like memory-saving backprop, we propose a flexible and efficient regularizer. For problems where global invertibility is necessary, such as applying normalizing flows on OOD data, we show the importance of designing stable INN building blocks.
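For reference, a standard statement of the change-of-variables formula the abstract invokes, together with the bi-Lipschitz property whose constants the paper bounds (the notation here is ours; f is an invertible map with z = f(x)):

    \log p_X(x) \;=\; \log p_Z\big(f(x)\big) \;+\; \log\bigl|\det J_f(x)\bigr|

    % f is bi-Lipschitz with constants L and L* when
    \tfrac{1}{L^{*}}\,\|x - y\| \;\le\; \|f(x) - f(y)\| \;\le\; L\,\|x - y\|

    % so Lip(f) <= L and Lip(f^{-1}) <= L*; an "exploding inverse"
    % corresponds to L* growing without bound.

When L* blows up, the log-determinant term and the reconstruction x = f^{-1}(z) become numerically unreliable even though f is invertible on paper.

Below is a minimal numerical sketch of this failure in a single affine coupling layer, z = (x1, x2 * exp(s(x1)) + t(x1)). This is our own toy illustration, not the paper's code: scale_net and shift_net are hypothetical stand-ins for learned subnetworks, and the strongly contracting forward scale is chosen so that the analytic inverse amplifies float64 round-off by a factor of exp(30):

    import numpy as np

    # Toy affine coupling layer: z1 = x1, z2 = x2 * exp(s(x1)) + t(x1).
    # scale_net / shift_net are hypothetical stand-ins for learned nets.

    def scale_net(x1):
        return -20.0 * x1   # strongly contracting forward scale

    def shift_net(x1):
        return 0.5 * x1

    def forward(x1, x2):
        return x1, x2 * np.exp(scale_net(x1)) + shift_net(x1)

    def inverse(z1, z2):
        # The analytic inverse rescales by exp(-s) = exp(+30) here,
        # so round-off stored in z2 is amplified by the same factor.
        return z1, (z2 - shift_net(z1)) * np.exp(-scale_net(z1))

    x1, x2 = 1.5, 1.0
    z1, z2 = forward(x1, x2)
    _, x2_rec = inverse(z1, z2)
    print(abs(x2_rec - x2))  # ~1e-4, twelve orders above float64 epsilon

Pushing the scale slightly further (e.g. s = -40) drives the reconstruction error to 1.0: x2's contribution to z2 falls below machine precision, the analytic inverse recovers nothing, and the layer is numerically non-invertible.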