Paper Title
Doubly Debiased Lasso: High-Dimensional Inference under Hidden Confounding
Authors
Abstract
Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected by hidden confounding, and propose the Doubly Debiased Lasso estimator for individual components of the regression coefficient vector. Our advocated method simultaneously corrects both the bias due to estimation of high-dimensional parameters and the bias caused by the hidden confounding. We establish its asymptotic normality and also prove that it is efficient in the Gauss-Markov sense. The validity of our methodology relies on a dense confounding assumption, i.e., that every confounding variable affects many covariates. The finite-sample performance is illustrated with an extensive simulation study and a genomic application.
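The setting summarized in the abstract can be sketched as a pair of linear equations. The abstract fixes no notation, so every symbol below ($H$, $\Psi$, $\phi$, $E$, $e$, and the dimensions $n$, $p$, $q$) is an assumption introduced here for illustration only.

```latex
% Assumed notation (not taken from the abstract):
%   n samples, p measured covariates (p may exceed n), q hidden confounders.
%   Y in R^n          : response
%   X in R^{n x p}    : measured covariates
%   H in R^{n x q}    : hidden (unobserved) confounders
%   beta in R^p       : regression coefficient vector; the target is a single component beta_j
%   phi in R^q        : effect of the confounders on the response
%   Psi in R^{q x p}  : effect of the confounders on the covariates
%   e, E              : noise terms, taken to be uncorrelated with H
\begin{align*}
  Y &= X\beta + H\phi + e, \\
  X &= H\Psi + E.
\end{align*}
% Dense confounding, informally: each row of Psi has many nonzero entries,
% i.e. every confounder affects many of the p covariates.
```

Under this sketch, a regression of $Y$ on $X$ that ignores $H$ leaves the term $H\phi$ in the error, and that term is correlated with $X$ through $\Psi$. This is the confounding bias; the shrinkage incurred when estimating a high-dimensional $\beta$ is the second bias the abstract refers to.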