Paper Title
Why Spectral Normalization Stabilizes GANs: Analysis and Improvements
Paper Authors
Paper Abstract
Spectral normalization (SN) is a widely used technique for improving the stability and sample quality of Generative Adversarial Networks (GANs). However, there is currently limited understanding of why SN is effective. In this work, we show that SN controls two important failure modes of GAN training: exploding and vanishing gradients. Our proofs illustrate a (perhaps unintentional) connection with the successful LeCun initialization. This connection helps to explain why the most popular implementation of SN for GANs requires no hyper-parameter tuning, whereas stricter implementations of SN have poor empirical performance out-of-the-box. Unlike LeCun initialization, which only controls vanishing gradients at the beginning of training, SN preserves this property throughout training. Building on this theoretical understanding, we propose a new spectral normalization technique: Bidirectional Scaled Spectral Normalization (BSSN), which incorporates insights from later improvements to LeCun initialization: Xavier initialization and Kaiming initialization. Theoretically, we show that BSSN gives better gradient control than SN. Empirically, we demonstrate that it outperforms SN in sample quality and training stability on several benchmark datasets.
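For context, the standard SN that the paper analyzes (Miyato et al., 2018) divides each weight matrix by an estimate of its largest singular value, obtained cheaply via power iteration. Below is a minimal NumPy sketch of that baseline technique only; it does not implement the paper's proposed BSSN, whose details are not given in this abstract, and the function names and iteration count are illustrative choices, not part of the paper.

```python
import numpy as np

def spectral_norm(W, n_iters=20):
    """Estimate the largest singular value sigma(W) by power iteration.

    In practice n_iters can be as small as 1 when the vectors are
    warm-started across training steps; we run more iterations here
    for a from-scratch estimate.
    """
    u = np.random.randn(W.shape[0])
    v = W.T @ u
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    # With unit u and v aligned to the top singular pair,
    # u^T W v approximates sigma(W).
    return u @ W @ v

def apply_sn(W):
    """Return W / sigma(W), constraining the layer's spectral norm
    (and hence its Lipschitz constant) to at most ~1."""
    return W / spectral_norm(W)

# Usage: the normalized matrix has top singular value close to 1.
W = np.random.randn(256, 128)
W_sn = apply_sn(W)
print(np.linalg.svd(W_sn, compute_uv=False)[0])  # ~1.0
```

Bounding each layer's spectral norm this way caps how much any layer can amplify activations or gradients, which is the mechanism behind the exploding/vanishing-gradient control discussed in the abstract.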