Paper Title

Hyperparameter Selection for Subsampling Bootstraps

Paper Authors

Yingying Ma and Hansheng Wang

Paper Abstract

Massive data analysis has become increasingly prevalent, and subsampling methods such as the BLB (Bag of Little Bootstraps) serve as powerful tools for assessing the quality of estimators for massive data. However, the performance of subsampling methods is highly influenced by the selection of tuning parameters (e.g., the subset size and the number of resamples per subset). In this article we develop a hyperparameter selection methodology that can be used to select tuning parameters for subsampling methods. Specifically, through a careful theoretical analysis, we find an analytically simple and elegant relationship between the asymptotic efficiency of various subsampling estimators and their hyperparameters. This leads to an optimal choice of the hyperparameters. More specifically, an arbitrarily specified hyperparameter set can be improved to a new set of hyperparameters at no extra CPU time cost, while the statistical efficiency of the resulting estimator is much improved. Both simulation studies and real data analysis demonstrate the advantages of our method.
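To make the role of the tuning parameters concrete, the following is a minimal illustrative sketch of a BLB-style standard-error estimate in Python. It is not the authors' implementation; the names (`blb_stderr`, `subset_size`, `n_subsets`, `n_resamples`) are our own, and the estimator shown is simply a weighted sample mean.

```python
import numpy as np

def blb_stderr(data, estimator, subset_size, n_subsets, n_resamples, seed=None):
    """Bag of Little Bootstraps sketch: estimate the standard error of `estimator`.

    Tuning parameters (the "hyperparameters" discussed in the abstract):
      subset_size  -- size b of each small subset,
      n_subsets    -- number s of subsets drawn,
      n_resamples  -- number r of bootstrap resamples per subset.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    subset_ses = []
    for _ in range(n_subsets):
        # Draw a small subset of size b without replacement.
        subset = rng.choice(data, size=subset_size, replace=False)
        stats = []
        for _ in range(n_resamples):
            # Assign multinomial weights summing to n, so each resample
            # mimics a full-size (size-n) bootstrap sample of the subset.
            weights = rng.multinomial(n, np.full(subset_size, 1.0 / subset_size))
            stats.append(estimator(subset, weights))
        # Standard error estimate from this subset's resamples.
        subset_ses.append(np.std(stats, ddof=1))
    # Average the per-subset standard-error estimates.
    return float(np.mean(subset_ses))

def weighted_mean(x, w):
    # Example estimator: weighted sample mean.
    return np.average(x, weights=w)
```

Under this sketch, increasing `subset_size` or `n_resamples` improves the quality of the standard-error estimate at a higher CPU cost, which is exactly the trade-off the proposed hyperparameter selection method optimizes.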
