Untapped Potential of Data Augmentation: A Domain Generalization Viewpoint

Paper title

Untapped Potential of Data Augmentation: A Domain Generalization Viewpoint

Authors

Vihari Piratla, Shiv Shankar

Abstract

Data augmentation is a popular pre-processing trick to improve generalization accuracy. It is believed that by processing augmented inputs in tandem with the original ones, the model learns a more robust set of features that are shared between the original and augmented counterparts. However, we show that this is not the case even for the best augmentation technique. In this work, we take a Domain Generalization viewpoint of augmentation-based methods. This new perspective allows for probing overfitting and delineating avenues for improvement. Our exploration with the state-of-the-art augmentation method provides evidence that the learned representations are not as robust as expected, even to distortions used during training. This suggests that the potential of augmented examples remains untapped.
