Paper Title

Adversarial Training with Stochastic Weight Average

Authors

Hwang, Joong-Won; Lee, Youngwan; Oh, Sungchan; Bae, Yuseok

Abstract

Adversarially training deep neural networks often suffers from serious overfitting. Recently, this overfitting has been explained by the sample complexity of the training data being insufficient to generalize robustness. In traditional machine learning, one way to relieve overfitting caused by a lack of data is to use ensemble methods. However, adversarially training multiple networks is extremely expensive. Moreover, we found that there is a dilemma in choosing the target model for generating adversarial examples: optimizing the attack against the individual members of the ensemble yields a suboptimal attack on the ensemble and incurs covariate shift, while attacking the ensemble as a whole weakens the members and loses the benefit of ensembling. In this paper, we propose adversarial training with stochastic weight averaging (SWA): while performing adversarial training, we aggregate the temporal weight states along the training trajectory. By adopting SWA, the benefit of an ensemble can be gained without a large computational increase and without facing the dilemma. Moreover, we further improve SWA to be better suited to adversarial training. Empirical results on CIFAR-10, CIFAR-100, and SVHN show that our method improves the robustness of models.
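The core of the method described above is aggregating the temporal weight states along the training trajectory with a running average. A minimal sketch of that averaging step, using plain Python lists as stand-in "weights" (in the paper's setting the snapshots would come from adversarial training epochs; the values and the `swa_update` helper below are illustrative assumptions, not the authors' implementation):

```python
# Minimal sketch of stochastic weight averaging (SWA) over a training
# trajectory. Each "snapshot" stands in for a model's weight vector taken
# at the end of a training epoch; here they are dummy values.

def swa_update(swa_weights, new_weights, n_averaged):
    """Running average of weight snapshots: w_swa <- (n*w_swa + w) / (n+1)."""
    return [(s * n_averaged + w) / (n_averaged + 1)
            for s, w in zip(swa_weights, new_weights)]

# Simulated trajectory of weight snapshots, one per epoch.
trajectory = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

swa = trajectory[0]
for n, snapshot in enumerate(trajectory[1:], start=1):
    swa = swa_update(swa, snapshot, n)

print(swa)  # -> [3.0, 4.0], the element-wise mean of the three snapshots
```

The running-average form is equivalent to a plain mean over all snapshots but lets the averaged weights be maintained online during training, which is why it adds almost no computational cost compared with adversarially training multiple networks for an explicit ensemble.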
