Paper Title
Interpolation and Regularization for Causal Learning
Paper Authors
Paper Abstract
We study the problem of learning causal models from observational data through the lens of interpolation and its counterpart -- regularization. A large volume of recent theoretical as well as empirical work suggests that, in highly complex model classes, interpolating estimators can have good statistical generalization properties and can even be optimal for statistical learning. Motivated by an analogy between statistical and causal learning recently highlighted by Janzing (2019), we investigate whether interpolating estimators can also learn good causal models. To this end, we consider a simple linearly confounded model and derive precise asymptotics for the *causal risk* of the min-norm interpolator and ridge-regularized regressors in the high-dimensional regime. Under the principle of independent causal mechanisms, a standard assumption in causal learning, we find that interpolators cannot be optimal and that causal learning requires stronger regularization than statistical learning. This resolves a recent conjecture of Janzing (2019). Beyond this assumption, we find a larger range of behavior that can be precisely characterized by a new measure of *confounding strength*. If the confounding strength is negative, causal learning requires weaker regularization than statistical learning, interpolators can be optimal, and the optimal regularization can even be negative. If the confounding strength is large, the optimal regularization is infinite, and learning from observational data is actively harmful.
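To make the statistical-versus-causal comparison concrete, below is a minimal simulation sketch of a linearly confounded model: a latent confounder Z drives both the covariates X and the response y, and we compare the *statistical risk* (prediction on fresh observational data) with the *causal risk* (distance to the true causal coefficients) of ridge regressors across regularization strengths, including the min-norm interpolator as lambda -> 0+. The Gaussian design, dimensions, noise levels, and identity interventional covariance are all illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch (illustrative assumptions, not the paper's exact model):
# a latent confounder Z drives both the covariates X and the response y,
# so observational regression of y on X is biased away from the true
# causal coefficients beta. We compare statistical vs. causal risk of
# ridge regressors, including the min-norm interpolator (lambda -> 0+).
import numpy as np

rng = np.random.default_rng(0)
n, d, l = 100, 200, 5                    # high-dimensional regime: d > n
M = rng.normal(size=(l, d))              # confounder -> covariates mixing
beta = rng.normal(size=d) / np.sqrt(d)   # true causal coefficients
alpha = rng.normal(size=l)               # confounder -> response effect

def sample(n):
    Z = rng.normal(size=(n, l))          # latent confounder
    X = Z @ M + rng.normal(size=(n, d))  # confounded covariates
    y = X @ beta + Z @ alpha + 0.1 * rng.normal(size=n)
    return X, y

X, y = sample(n)
X_test, y_test = sample(2000)

for lam in [0.0, 1.0, 10.0, 100.0]:
    if lam == 0.0:
        b = np.linalg.pinv(X) @ y        # min-norm interpolator
    else:
        b = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    # Statistical risk: squared error on fresh *observational* data.
    stat_risk = np.mean((X_test @ b - y_test) ** 2)
    # Causal risk: under an intervention do(X), X is decoupled from Z;
    # assuming an identity interventional covariance this is ||b - beta||^2.
    causal_risk = np.sum((b - beta) ** 2)
    print(f"lambda={lam:6.1f}  statistical risk={stat_risk:.3f}  "
          f"causal risk={causal_risk:.3f}")
```

In runs of this sketch, the lambda minimizing the causal risk is typically larger than the one minimizing the statistical risk: shrinking the estimator toward zero also shrinks the confounding bias, consistent with the abstract's claim that, under random (ICM-like) confounding, causal learning requires stronger regularization than statistical learning.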