Paper Title

A New Look at Ghost Normalization

Authors

Neofytos Dimitriou, Ognjen Arandjelovic

Abstract

Batch normalization (BatchNorm) is an effective yet poorly understood technique for neural network optimization. It is often assumed that the degradation in BatchNorm performance at smaller batch sizes stems from its having to estimate layer statistics from smaller sample sizes. Recently, however, Ghost normalization (GhostNorm), a variant of BatchNorm that explicitly uses smaller sample sizes for normalization, has been shown to improve upon BatchNorm on some datasets. Our contributions are: (i) we uncover a source of regularization that is unique to GhostNorm and not simply inherited from BatchNorm; (ii) we describe three implementations of GhostNorm, two of which employ BatchNorm as the underlying normalization technique; (iii) by visualising the loss landscape of GhostNorm, we observe that GhostNorm consistently decreases the smoothness of the loss landscape when compared to BatchNorm; and (iv) we introduce Sequential Normalization (SeqNorm) and report performance superior to state-of-the-art methodologies on both the CIFAR-10 and CIFAR-100 datasets.
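To make the core idea concrete, the following is a minimal PyTorch-style sketch of ghost normalization: the mini-batch is split into equally sized "ghost" batches, and each ghost batch is normalized with its own per-channel mean and variance rather than the statistics of the whole batch. This is an illustrative assumption-laden sketch, not the authors' reference implementation; the class name `GhostNorm2d` and the `num_ghosts` parameter are hypothetical, and running statistics for inference are omitted for brevity.

```python
import torch
import torch.nn as nn


class GhostNorm2d(nn.Module):
    """Illustrative ghost normalization for NCHW inputs (training-mode only)."""

    def __init__(self, num_features: int, num_ghosts: int = 4, eps: float = 1e-5):
        super().__init__()
        self.num_ghosts = num_ghosts
        self.eps = eps
        # Learnable affine parameters, shared across all ghost batches.
        self.weight = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_features, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        assert n % self.num_ghosts == 0, "batch size must divide evenly into ghost batches"
        # Split the batch dimension into ghost batches: (G, N/G, C, H, W).
        x = x.view(self.num_ghosts, n // self.num_ghosts, c, h, w)
        # Per-ghost-batch, per-channel statistics over (samples, H, W).
        mean = x.mean(dim=(1, 3, 4), keepdim=True)
        var = x.var(dim=(1, 3, 4), unbiased=False, keepdim=True)
        x = (x - mean) / torch.sqrt(var + self.eps)
        # Merge ghost batches back and apply the shared affine transform.
        x = x.view(n, c, h, w)
        return x * self.weight + self.bias
```

For example, `GhostNorm2d(64, num_ghosts=4)` applied to a tensor of shape `(32, 64, 8, 8)` normalizes four ghost batches of eight samples each; setting `num_ghosts=1` recovers plain batch-wise statistics, as in BatchNorm.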
