Paper title
Memory Classifiers: Two-stage Classification for Robustness in Machine Learning
Paper authors
Paper abstract
The performance of machine learning models can degrade significantly under distribution shifts of the data. We propose a new classification method that can improve robustness to distribution shifts by combining expert knowledge about the "high-level" structure of the data with standard classifiers. Specifically, we introduce two-stage classifiers called memory classifiers. First, these identify prototypical data points, called memories, and use them to cluster the training data. This step is based on features designed with expert guidance; for instance, for image data they can be extracted using digital image processing algorithms. Then, within each cluster, we learn local classifiers based on finer discriminating features, via standard models like deep neural networks. We establish generalization bounds for memory classifiers. We illustrate in experiments that they can improve generalization and robustness to distribution shifts on image datasets. We show improvements that go beyond standard data augmentation techniques.
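The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes that memories are picked by greedy farthest-point selection on the coarse (expert-designed) features, that each training point is assigned to its nearest memory, and that the local model in each cluster is a nearest-centroid classifier on the fine features, standing in for the deep neural networks mentioned above. The abstract does not specify the actual selection or training procedures.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point` (first index wins ties)."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def pick_memories(coarse, k, seed=0):
    """Assumed selection rule: greedy farthest-point choice of k training points."""
    rng = random.Random(seed)
    memories = [coarse[rng.randrange(len(coarse))]]
    while len(memories) < k:
        # Add the point that is farthest from its closest existing memory.
        memories.append(
            max(coarse, key=lambda p: min(sq_dist(p, m) for m in memories))
        )
    return memories


class MemoryClassifierSketch:
    """Hypothetical two-stage classifier: memories first, local models second."""

    def __init__(self, k):
        self.k = k

    def fit(self, coarse, fine, labels):
        # Stage 1: cluster the training data around memories (coarse features).
        self.memories = pick_memories(coarse, self.k)
        clusters = defaultdict(list)
        for c, f, y in zip(coarse, fine, labels):
            clusters[nearest(c, self.memories)].append((f, y))
        # Stage 2: one local nearest-centroid model per cluster (fine features).
        self.local = {}
        for cid, items in clusters.items():
            by_label = defaultdict(list)
            for f, y in items:
                by_label[y].append(f)
            self.local[cid] = {
                y: [sum(col) / len(fs) for col in zip(*fs)]
                for y, fs in by_label.items()
            }
        return self

    def predict(self, coarse_x, fine_x):
        # Route the input to its memory, then apply that cluster's local model.
        centroids = self.local[nearest(coarse_x, self.memories)]
        return min(centroids, key=lambda y: sq_dist(fine_x, centroids[y]))


# Toy data: the coarse feature separates two groups, and the mapping from the
# fine feature to the label is reversed between the groups.
coarse = [[0.0], [0.1], [0.2], [0.3], [10.0], [10.1], [10.2], [10.3]]
fine = [[-1.0], [-1.2], [1.0], [1.2], [-1.0], [-1.2], [1.0], [1.2]]
labels = ["a", "a", "b", "b", "b", "b", "a", "a"]

clf = MemoryClassifierSketch(k=2).fit(coarse, fine, labels)
print(clf.predict([0.15], [-0.9]))   # a
print(clf.predict([10.15], [-0.9]))  # b
```

In the toy data the same fine feature value maps to different labels depending on the cluster, so a single global nearest-centroid model on the fine feature alone could not separate the classes, while the per-cluster local models can.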