Paper Title
Take One Gram of Neural Features, Get Enhanced Group Robustness
Paper Authors
Paper Abstract
The predictive performance of machine learning models trained with empirical risk minimization (ERM) can degrade considerably under distribution shifts. Spurious correlations in training datasets lead ERM-trained models to display high loss when evaluated on minority groups that do not present such correlations. Extensive attempts have been made to develop methods that improve worst-group robustness. However, these methods require group information for each training input or, at least, a validation set with group labels to tune their hyperparameters, and such information may be expensive to obtain or unknown a priori. In this paper, we address the challenge of improving group robustness without group annotations during training or validation. To this end, we propose to partition the training dataset into groups based on Gram matrices of features extracted by an "identification" model and to apply robust optimization based on these pseudo-groups. In the realistic context where no group labels are available, our experiments show that our approach not only improves group robustness over ERM but also outperforms all recent baselines.
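The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering rule, the feature layer, or the exact robust objective, so everything below is an assumption made for illustration: feature maps of shape (N, C, H, W) stand in for the output of the "identification" model, a plain k-means applied within each class forms the pseudo-groups, and a worst-group average loss stands in for the robust optimization objective. The function names (gram_matrix, gram_descriptors, kmeans, pseudo_groups, worst_group_loss) are hypothetical and do not come from the paper.

```python
import numpy as np

def gram_matrix(feature_map):
    """Gram matrix of one feature map of shape (C, H, W)."""
    c = feature_map.shape[0]
    flat = feature_map.reshape(c, -1)         # (C, H*W)
    return flat @ flat.T / flat.shape[1]      # (C, C), averaged over positions

def gram_descriptors(feature_maps):
    """Upper triangle of each sample's Gram matrix, flattened to a vector."""
    iu = np.triu_indices(feature_maps.shape[1])
    return np.stack([gram_matrix(f)[iu] for f in feature_maps])

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); needs len(x) >= k."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):           # leave empty clusters unchanged
                centers[j] = x[assign == j].mean(axis=0)
    return assign

def pseudo_groups(feature_maps, labels, k=2):
    """Cluster Gram descriptors inside each class (assumption) to get group ids."""
    desc = gram_descriptors(feature_maps)
    groups = np.zeros(len(labels), dtype=int)
    for ci, c in enumerate(np.unique(labels)):
        idx = np.where(labels == c)[0]
        groups[idx] = ci * k + kmeans(desc[idx], k)
    return groups

def worst_group_loss(losses, groups):
    """Largest per-group mean loss: the quantity a robust objective would target."""
    return max(losses[groups == g].mean() for g in np.unique(groups))

# Synthetic stand-in for identification-model features: two classes,
# each mixing a low-energy and a high-energy "style" of feature map.
rng = np.random.default_rng(0)
scales = np.tile(np.repeat([1.0, 3.0], 20), 2)            # 80 samples
feats = rng.normal(size=(80, 4, 8, 8)) * scales[:, None, None, None]
labels = np.repeat([0, 1], 40)
groups = pseudo_groups(feats, labels, k=2)

per_sample_loss = rng.uniform(size=80)                    # placeholder losses
robust_objective = worst_group_loss(per_sample_loss, groups)
```

In a training loop, the pseudo-group ids would be computed once from the identification model and then passed to a group-robust training procedure in place of annotated groups; the worst-group loss above only shows which quantity such a procedure would try to reduce.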