Paper title
Imputation procedures in surveys using nonparametric and machine learning methods: an empirical comparison
Paper authors
Paper abstract
Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimensional data sets. The results suggest that a number of machine learning procedures perform very well in terms of bias and efficiency.