Paper title
Deep invariant networks with differentiable augmentation layers
Paper authors
Paper abstract
Designing learning systems which are invariant to certain data transformations is critical in machine learning. Practitioners can typically enforce a desired invariance on the trained model through the choice of a network architecture, e.g. using convolutions for translations, or using data augmentation. Yet, enforcing true invariance in the network can be difficult, and data invariances are not always known a priori. State-of-the-art methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems, which are complex to solve and often computationally demanding. In this work we investigate new ways of learning invariances only from the training data. Using learnable augmentation layers built directly into the network, we demonstrate that our method is very versatile. It can incorporate any type of differentiable augmentation and be applied to a broad class of learning problems beyond computer vision. We provide empirical evidence showing that our approach is easier and faster to train than modern automatic data augmentation techniques based on bilevel optimization, while achieving comparable results. Experiments show that while the invariances transferred to a model through automatic data augmentation are limited by the model expressivity, the invariance yielded by our approach is insensitive to it by design.
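To make the idea concrete, below is a minimal sketch, in PyTorch, of what a learnable augmentation layer could look like; it is an illustrative assumption, not the authors' released code. A rotation layer keeps the half-width of its angle range as a trainable parameter and samples angles with the reparameterization trick, so the ordinary training loss back-propagates into the augmentation itself.

```python
# Minimal sketch (assumed, not the paper's implementation) of a learnable
# augmentation layer: a rotation whose range `theta` is a trainable parameter.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableRotation(nn.Module):
    """Rotates inputs by an angle drawn uniformly from [-theta, theta].

    The angle is a differentiable function of `theta` (reparameterization
    trick), so gradients of the training loss flow into the range itself.
    """

    def __init__(self, init_range=0.1):
        super().__init__()
        # Trainable half-width of the rotation range, in radians.
        self.theta = nn.Parameter(torch.tensor(init_range))

    def forward(self, x):
        b = x.size(0)
        # u ~ U(-1, 1), angle = u * theta  (differentiable in theta).
        u = 2 * torch.rand(b, device=x.device) - 1
        angle = u * self.theta
        cos, sin = torch.cos(angle), torch.sin(angle)
        # Batched 2x3 affine matrices encoding the sampled rotations.
        mat = torch.zeros(b, 2, 3, device=x.device)
        mat[:, 0, 0], mat[:, 0, 1] = cos, -sin
        mat[:, 1, 0], mat[:, 1, 1] = sin, cos
        # Differentiable warp: gradients reach `theta` through the grid.
        grid = F.affine_grid(mat, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

# Usage: prepend the layer to any network and train end-to-end on the
# ordinary training loss -- no held-out data or bilevel optimization.
model = nn.Sequential(
    LearnableRotation(),
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
```

Note that in this family of methods the classification loss alone can drive the learned range toward zero, so a regularizer encouraging wider augmentation ranges is typically added; the sketch omits it for brevity.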