Paper Title
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images
Paper Authors
Paper Abstract
We present a hierarchical VAE that, for the first time, generates samples quickly while outperforming the PixelCNN in log-likelihood on all natural image benchmarks. We begin by observing that, in theory, VAEs can actually represent autoregressive models, as well as faster, better models if they exist, when made sufficiently deep. Despite this, autoregressive models have historically outperformed VAEs in log-likelihood. We test if insufficient depth explains why by scaling a VAE to greater stochastic depth than previously explored and evaluating it on CIFAR-10, ImageNet, and FFHQ. In comparison to the PixelCNN, these very deep VAEs achieve higher likelihoods, use fewer parameters, generate samples thousands of times faster, and are more easily applied to high-resolution images. Qualitative studies suggest this is because the VAE learns efficient hierarchical visual representations. We release our source code and models at https://github.com/openai/vdvae.
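To make the notion of "stochastic depth" concrete, below is a minimal sketch of a top-down hierarchical VAE of the kind the abstract describes. This is not the authors' implementation (the released code at https://github.com/openai/vdvae is the reference): the class names `TopDownBlock` and `VeryDeepVAE`, the single 32x32 resolution, and the 1x1-conv parameterization of the prior and posterior are simplifying assumptions made for illustration. The key point it shows is that each stochastic layer conditions on a running state that accumulates all previously sampled latents, which is how sufficient depth lets the model express autoregressive-style dependencies.

```python
# A minimal, illustrative sketch of a top-down hierarchical VAE,
# NOT the VDVAE implementation (see https://github.com/openai/vdvae).
# `TopDownBlock`, `VeryDeepVAE`, and the fixed 32x32 resolution are
# hypothetical simplifications; the real model uses residual bottleneck
# blocks, multiple resolutions, and far more stochastic layers.
import torch
import torch.nn as nn

def gaussian_kl(qm, qv, pm, pv):
    """KL( N(qm, exp(qv)^2) || N(pm, exp(pv)^2) ), summed over all but the batch dim.
    qv and pv are log standard deviations."""
    kl = pv - qv - 0.5 + 0.5 * (torch.exp(qv) ** 2 + (qm - pm) ** 2) / torch.exp(pv) ** 2
    return kl.flatten(1).sum(dim=1)

class TopDownBlock(nn.Module):
    """One stochastic layer: prior p(z_i | z_<i) and posterior q(z_i | z_<i, x)."""
    def __init__(self, width, zdim):
        super().__init__()
        self.prior = nn.Conv2d(width, 2 * zdim, 1)           # (mu, log-sigma) of the prior
        self.posterior = nn.Conv2d(2 * width, 2 * zdim, 1)   # (mu, log-sigma) of the posterior
        self.project = nn.Conv2d(zdim, width, 1)             # merge the sample back into the state

    def forward(self, state, acts):
        # Both prior and posterior condition on the running top-down state,
        # so deeper layers depend on all previously sampled latents.
        pm, pv = self.prior(state).chunk(2, dim=1)
        qm, qv = self.posterior(torch.cat([state, acts], dim=1)).chunk(2, dim=1)
        z = qm + torch.randn_like(qm) * torch.exp(qv)        # reparameterized sample
        return state + self.project(z), gaussian_kl(qm, qv, pm, pv)

class VeryDeepVAE(nn.Module):
    def __init__(self, width=64, zdim=16, depth=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.GELU(),
            nn.Conv2d(width, width, 3, padding=1))
        self.bias = nn.Parameter(torch.zeros(1, width, 32, 32))  # learned top-down input
        self.blocks = nn.ModuleList(TopDownBlock(width, zdim) for _ in range(depth))
        self.out = nn.Conv2d(width, 3, 3, padding=1)

    def forward(self, x):
        acts = self.encoder(x)                                # bottom-up activations from x
        state = self.bias.expand(x.shape[0], -1, -1, -1)
        total_kl = 0.0
        for block in self.blocks:                             # one KL term per stochastic layer
            state, kl = block(state, acts)
            total_kl = total_kl + kl
        return self.out(state), total_kl                      # reconstruction params, KL of the ELBO

# Usage: recon, kl = VeryDeepVAE()(torch.randn(8, 3, 32, 32))
```

Sampling from such a model needs only one pass through the `depth` stochastic layers (drawing each `z` from the prior instead of the posterior), rather than one pass per pixel as in a PixelCNN, which is where the speed advantage claimed in the abstract comes from.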