Paper Title

Estimating Extreme Value Index by Subsampling for Massive Datasets with Heavy-Tailed Distributions

Authors

Yongxin Li, Liujun Chen, Deyuan Li, Hansheng Wang

Abstract

Modern statistical analyses often encounter datasets with massive sizes and heavy-tailed distributions. For datasets with massive sizes, traditional estimation methods can hardly be used to estimate the extreme value index directly. To address this issue, we propose a subsampling-based method. Specifically, multiple subsamples are drawn from the whole dataset by simple random subsampling with replacement. Based on each subsample, an approximate maximum likelihood estimator can be computed. The resulting estimators are then averaged to form a more accurate one. Under appropriate regularity conditions, we show theoretically that the proposed estimator is consistent and asymptotically normal. With the help of the estimated extreme value index, we can consistently estimate high-level quantiles and tail probabilities of a heavy-tailed random variable. Extensive simulation experiments are provided to demonstrate the promising performance of our method. A real data analysis is also presented for illustration.
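
The subsample-then-average procedure described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the form of the approximate maximum likelihood estimator, so the classical Hill estimator is used here as a stand-in, and the function names (`hill_estimator`, `subsample_evi`) and parameters (`n_subsamples`, `subsample_size`, `k`) are hypothetical.

```python
import numpy as np


def hill_estimator(sample, k):
    """Hill estimator of the extreme value index from the k largest values.

    Assumption: used here as a stand-in for the abstract's
    "approximate maximum likelihood estimator".
    """
    x = np.sort(np.asarray(sample, dtype=float))
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    threshold = x[-k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold must be positive to take logarithms")
    # Mean log-excess of the k largest observations over the threshold.
    return float(np.mean(np.log(x[-k:]) - np.log(threshold)))


def subsample_evi(data, n_subsamples, subsample_size, k, seed=None):
    """Average the per-subsample estimates of the extreme value index.

    Each subsample is drawn by simple random sampling with replacement.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    estimates = np.empty(n_subsamples)
    for b in range(n_subsamples):
        idx = rng.integers(0, data.size, size=subsample_size)
        estimates[b] = hill_estimator(data[idx], k)
    return float(estimates.mean())


if __name__ == "__main__":
    # Synthetic check: classical Pareto data with shape 2, so the true
    # extreme value index is 1 / 2 = 0.5.
    rng = np.random.default_rng(0)
    data = rng.pareto(2.0, size=200_000) + 1.0
    print(subsample_evi(data, n_subsamples=50, subsample_size=2000, k=200, seed=1))
```

Each subsample needs only `subsample_size` observations to be sorted, which is what makes the approach attractive when the full dataset is too large to process at once. The choice of `k` here is arbitrary and would need tuning in practice.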
