Paper Title
Domain Generalization by Mutual-Information Regularization with Pre-trained Models
Paper Authors
Paper Abstract
Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains. Due to the significant domain shifts between training and test domains, previous DG attempts fail to learn domain-invariant representations from the source domains alone. Instead, we re-formulate the DG objective using mutual information with the oracle model, a model generalized to any possible domain. We derive a tractable variational lower bound by approximating the oracle model with a pre-trained model, and call the resulting method Mutual Information Regularization with Oracle (MIRO). Our extensive experiments show that MIRO significantly improves out-of-distribution performance. Furthermore, our scaling experiments show that the larger the scale of the pre-trained model, the greater the performance improvement of MIRO. Source code is available at https://github.com/kakaobrain/miro.
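To make the variational lower bound concrete, the sketch below illustrates the general shape of such a regularizer: the frozen pre-trained features are modeled by a Gaussian variational distribution conditioned on the current encoder's features, and the penalty is the resulting negative log-likelihood (up to constants). This is a minimal illustration, not the paper's implementation; the identity mean encoder, the diagonal variance `log_var`, and all variable names are assumptions for the sketch.

```python
import numpy as np

def mi_lower_bound_penalty(z, z0, log_var):
    """Sketch of a variational MI lower-bound penalty.

    z       : features from the encoder being trained, shape (batch, dim)
    z0      : features from the frozen pre-trained (oracle-proxy) encoder
    log_var : log of a diagonal variance for the Gaussian variational
              distribution q(z0 | z), shape (dim,)

    Returns the mean negative Gaussian log-likelihood of z0 under
    q(. | z) with mean z (identity mean encoder), dropping constants.
    Minimizing this term maximizes the lower bound on MI(z; z0).
    """
    var = np.exp(log_var)
    return np.mean(log_var + (z0 - z) ** 2 / var)

# Toy usage with hypothetical feature vectors.
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))              # current encoder features
z0 = z + 0.1 * rng.normal(size=(4, 8))   # frozen pre-trained features
penalty = mi_lower_bound_penalty(z, z0, log_var=np.zeros(8))
# Total training loss would combine this with the task loss,
# e.g. loss = cross_entropy + lam * penalty, with lam a small weight.
```

In a full training loop, `log_var` (and a non-identity mean map) would be learnable parameters optimized jointly with the encoder, so the model trades off feature fidelity to the pre-trained representation against the task objective.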