Paper Title
Statistical Inference on Explained Variation in High-dimensional Linear Model with Dense Effects
Authors
Abstract
Statistical inference on the variation of an outcome explained by a set of covariates is of particular interest in practice. When the covariates are of moderate to high dimension and the effects are not sparse, several approaches have been proposed for estimation and inference. One major problem with the existing approaches is that their inference procedures are not robust to violations of the normality assumptions on the covariates and the residual errors. In this paper, we propose an estimating equation approach to estimation and inference on the explained variation in the high-dimensional linear model. Unlike the existing approaches, the proposed approach does not rely on restrictive normality assumptions for inference. We show that the proposed estimator is consistent and asymptotically normally distributed under reasonable conditions. Simulation studies demonstrate that the proposed inference procedure performs better than the existing approaches. The proposed approach is applied to study the variation in glycohemoglobin explained by environmental pollutants in a National Health and Nutrition Examination Survey data set.
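To make the estimand concrete, the sketch below simulates a linear model with dense effects and estimates the explained variation r^2 = tau^2 / (tau^2 + sigma^2), where tau^2 = ||beta||^2 and sigma^2 is the residual variance. It is not the estimating equation approach proposed in the paper, whose details the abstract does not give. It uses a simple moment-based estimator whose unbiasedness relies on the covariates being i.i.d. standard normal, which is the kind of normality assumption the abstract describes as restrictive. The function names, sample sizes, and parameter values are illustrative assumptions.

```python
import random


def simulate(n, d, tau2, sigma2, rng):
    """Draw (X, y) from y = X beta + eps with i.i.d. N(0, 1) covariates.

    Assumption for illustration: effects are dense (every coefficient is
    nonzero and equal), scaled so that ||beta||^2 = tau2.
    """
    beta = [(tau2 / d) ** 0.5] * d
    noise_sd = sigma2 ** 0.5
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, noise_sd)
         for row in X]
    return X, y


def moment_r2(X, y):
    """Moment-based estimate of r^2 = tau^2 / (tau^2 + sigma^2).

    Under i.i.d. N(0, 1) covariates and mean-zero errors independent of X:
        E||y||^2     = n (tau^2 + sigma^2)
        E||X^T y||^2 = n [(d + n + 1) tau^2 + d sigma^2]
    Solving these two equations gives unbiased estimates of tau^2 and sigma^2.
    """
    n, d = len(X), len(X[0])
    yy = sum(v * v for v in y)                 # ||y||^2
    xty = [0.0] * d                            # X^T y, accumulated row by row
    for row, yi in zip(X, y):
        for j, x in enumerate(row):
            xty[j] += x * yi
    q = sum(v * v for v in xty)                # ||X^T y||^2
    tau2_hat = (q - d * yy) / (n * (n + 1))
    sigma2_hat = ((d + n + 1) * yy - q) / (n * (n + 1))
    return tau2_hat / (tau2_hat + sigma2_hat)  # denominator equals ||y||^2 / n


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = simulate(n=2000, d=200, tau2=1.0, sigma2=1.0, rng=rng)
    print("true r^2 = 0.5, estimated r^2 =", round(moment_r2(X, y), 3))
```

Because the second moment identity uses fourth moments of the normal distribution, this estimator can be biased when the covariates are non-normal, which is one way to see why an approach that avoids the normality assumption is of interest.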