论文标题
通过修剪硬阈值的强大高维期望最大化算法
Robust High Dimensional Expectation Maximization Algorithm via Trimmed Hard Thresholding
论文作者
论文摘要
在本文中,我们研究了在高维空间中使用任意损坏的样本({\ em I.E.具体而言,我们提出了一种称为修剪(梯度)期望最大化的方法,该方法分别为期望步骤(E-step)和最大化步骤(M-step)增加了修剪梯度的步骤和硬阈值步骤。我们表明,在一些温和的假设和适当初始化的情况下,算法是防腐败的,当损坏的样本的损失$ε$的分数$ \ tilde {o}(\ frac {1}} {\ sqrt {\ sqrt {n}}})时,算法是防腐败的,并收敛到(接近)最佳的最佳统计速率。此外,我们将一般框架应用于三个规范模型:高斯的混合物,回归的混合物以及线性回归与缺失协变量的混合物。我们的理论得到了彻底的数值结果的支持。
In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space ({\em i.e.,} $d\gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradients step and a hard thresholding step to the Expectation step (E-step) and the Maximization step (M-step), respectively. We show that under some mild assumptions and with an appropriate initialization, the algorithm is corruption-proofing and converges to the (near) optimal statistical rate geometrically when the fraction of the corrupted samples $ε$ is bounded by $ \tilde{O}(\frac{1}{\sqrt{n}})$. Moreover, we apply our general framework to three canonical models: mixture of Gaussians, mixture of regressions and linear regression with missing covariates. Our theory is supported by thorough numerical results.