Paper Title
Self-Reflective Variational Autoencoder
Paper Authors
Paper Abstract
The Variational Autoencoder (VAE) is a powerful framework for learning probabilistic latent-variable generative models. However, typical assumptions on the approximate posterior distribution of the encoder and/or the prior seriously restrict its capacity for inference and generative modeling. Variational inference based on neural autoregressive models respects the conditional dependencies of the exact posterior, but this flexibility comes at a cost: such models are expensive to train in high-dimensional regimes and can be slow to produce samples. In this work, we introduce an orthogonal solution, which we call self-reflective inference. By redesigning the hierarchical structure of existing VAE architectures, self-reflection ensures that the stochastic flow preserves the factorization of the exact posterior, sequentially updating the latent codes in a recurrent manner consistent with the generative model. We empirically demonstrate the clear advantages of matching the variational posterior to the exact posterior: on binarized MNIST, self-reflective inference achieves state-of-the-art performance without resorting to complex, computationally expensive components such as autoregressive layers. Moreover, we design a variational normalizing flow that employs the proposed architecture, yielding predictive benefits compared to its purely generative counterpart. Our proposed modification is quite general and complements the existing literature; self-reflective inference can naturally leverage advances in distribution estimation and generative modeling to improve the capacity of each layer in the hierarchy.
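As background for the objective the abstract builds on, the following is a minimal sketch of the standard VAE evidence lower bound (ELBO) with a mean-field Gaussian posterior and a Bernoulli decoder, the setting used for binarized MNIST. The toy latent/observation dimensions and the random linear decoder are illustrative assumptions, not details from the paper, and this sketch does not implement self-reflective inference itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo_estimate(x, enc_mu, enc_logvar, decode, n_samples=64):
    """Monte Carlo estimate of the ELBO for one binary datapoint x.

    enc_mu, enc_logvar: parameters of a mean-field Gaussian q(z|x),
    the kind of restrictive posterior assumption the abstract refers to.
    decode: maps latent samples z to Bernoulli logits over x.
    """
    d = enc_mu.shape[0]
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)
    eps = rng.standard_normal((n_samples, d))
    z = enc_mu + np.exp(0.5 * enc_logvar) * eps
    # Expected Bernoulli log-likelihood log p(x|z) under q(z|x):
    # for x in {0,1}, log p = x*logit - log(1 + exp(logit))
    logits = decode(z)                                  # (n_samples, x_dim)
    log_px_z = np.sum(x * logits - np.logaddexp(0.0, logits), axis=1)
    # Analytic KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior
    kl = 0.5 * np.sum(np.exp(enc_logvar) + enc_mu**2 - 1.0 - enc_logvar)
    return log_px_z.mean() - kl

# Toy example: 2-d latent, 4-d binary observation, random linear decoder.
W = rng.standard_normal((2, 4))
x = np.array([1.0, 0.0, 1.0, 0.0])
value = elbo_estimate(x, enc_mu=np.zeros(2), enc_logvar=np.zeros(2),
                      decode=lambda z: z @ W)
```

The ELBO lower-bounds log p(x), which is non-positive for binary data, so `value` is always negative here. Self-reflective inference keeps this objective but restructures the encoder hierarchy so that the variational factorization matches that of the exact posterior, rather than relying on the single diagonal Gaussian above.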