Paper Title

Stable Learning via Sparse Variable Independence

Authors

Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan Zhang

Abstract

The problem of covariate-shift generalization has attracted intensive research attention. Previous stable learning algorithms employ sample reweighting schemes to decorrelate the covariates when there is no explicit domain information about the training data. However, with finite samples, it is difficult to achieve the desirable weights that ensure perfect independence and thus get rid of the unstable variables. Moreover, decorrelating within the stable variables may bring about high variance in the learned models because of the over-reduced effective sample size; a tremendous sample size is required for these algorithms to work. In this paper, with theoretical justification, we propose SVI (Sparse Variable Independence) for the covariate-shift generalization problem. We introduce a sparsity constraint to compensate for the imperfection of sample reweighting under the finite-sample setting in previous methods. Furthermore, we organically combine independence-based sample reweighting and sparsity-based variable selection in an iterative way to avoid decorrelating within the stable variables, increasing the effective sample size to alleviate variance inflation. Experiments on both synthetic and real-world datasets demonstrate the improvement in covariate-shift generalization performance brought by SVI.
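To make the iterative scheme described in the abstract concrete, below is a minimal Python sketch of the idea: learn sample weights that decorrelate the currently selected covariates, fit a sparse weighted regression, drop the zeroed-out variables, and repeat. This is an illustration under simplifying assumptions, not the authors' implementation: the decorrelation objective (penalizing squared weighted covariances), the use of scikit-learn's Lasso for the selection step, and the hyperparameters n_iters, lr, lam, and n_outer are all hypothetical choices.

```python
import numpy as np
from sklearn.linear_model import Lasso

def decorrelation_weights(X, n_iters=300, lr=0.05):
    """Learn sample weights that shrink the pairwise weighted covariances
    among the covariates. A simplified stand-in for independence-based
    sample reweighting; X is assumed standardized, and the paper's actual
    objective may differ."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    w = np.ones(n)
    for _ in range(n_iters):
        probs = w / w.sum()
        cov = (Xc * probs[:, None]).T @ Xc      # weighted covariance matrix
        off = cov - np.diag(np.diag(cov))       # keep only off-diagonal terms
        # Approximate gradient of the sum of squared off-diagonal covariances
        # w.r.t. each sample weight (normalization term ignored for brevity).
        grad = 2.0 * np.einsum('ij,ni,nj->n', off, Xc, Xc) / w.sum()
        w = np.clip(w - lr * grad, 1e-3, None)  # keep weights strictly positive
    return w / w.mean()

def svi_sketch(X, y, n_outer=5, lam=0.1):
    """Alternate reweighting and sparsity-based selection: reweight only the
    currently selected covariates, fit a weighted Lasso, keep the variables
    with nonzero coefficients, and repeat until the support stabilizes."""
    selected = np.arange(X.shape[1])
    for _ in range(n_outer):
        w = decorrelation_weights(X[:, selected])
        model = Lasso(alpha=lam).fit(X[:, selected], y, sample_weight=w)
        keep = np.abs(model.coef_) > 1e-6
        if keep.all() or not keep.any():        # support stabilized or emptied
            break
        selected = selected[keep]
    return selected, model
```

The sketch only mirrors the alternating structure: reweighting on the current support means variables already identified as stable are not needlessly decorrelated against each other, which is the mechanism the abstract credits for preserving effective sample size. The paper's actual reweighting and selection steps come with finite-sample theoretical guarantees that this illustration does not attempt to reproduce.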
