Paper Title
Estimating Generalization under Distribution Shifts via Domain-Invariant Representations
Paper Authors
Paper Abstract
When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly yet overestimate their performance. In this work, we aim to better estimate a model's performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study the generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, determining early stopping, and error detection.
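To make the proxy idea concrete, the sketch below estimates a model's target error as its disagreement with a set of domain-invariant proxy predictors on unlabeled target inputs, using the proxies' predictions in place of the unknown true labels. This is a minimal illustration under stated assumptions, not the paper's exact estimator: the `predict` interface, the toy `_ThresholdClassifier`, and the worst-case aggregation over proxies are all assumptions made for the example.

```python
import numpy as np


class _ThresholdClassifier:
    """Toy stand-in for a trained classifier (hypothetical, for illustration only)."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        # Label 1 if the first feature exceeds the threshold, else 0.
        return (x[:, 0] > self.threshold).astype(int)


def estimate_target_error(model, proxy_models, x_target):
    """Estimate the target error of `model` on unlabeled inputs `x_target`
    as its worst-case disagreement with a set of domain-invariant proxy
    predictors, whose predictions stand in for the unknown true labels."""
    preds = model.predict(x_target)
    disagreements = [
        np.mean(preds != proxy.predict(x_target)) for proxy in proxy_models
    ]
    return max(disagreements)


# Usage with toy models and random unlabeled target inputs.
x_target = np.random.randn(1000, 2)
model = _ThresholdClassifier(0.0)
proxies = [_ThresholdClassifier(t) for t in (-0.1, 0.1)]
print(estimate_target_error(model, proxies, x_target))
```

The estimate is only as good as the proxies: if their own target risk is high, the disagreement can over- or under-state the model's true error, which is why the abstract ties the quality of the estimate to the generalization of the domain-invariant representation.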