通过修剪硬阈值的强大高维期望最大化算法

论文标题

通过修剪硬阈值的强大高维期望最大化算法

Robust High Dimensional Expectation Maximization Algorithm via Trimmed Hard Thresholding

论文作者

Wang, Di, Guo, Xiangyu, Li, Shi, Xu, Jinhui

论文摘要

在本文中，我们研究了在高维空间中使用任意损坏的样本（{\ em I.E.具体而言，我们提出了一种称为修剪（梯度）期望最大化的方法，该方法分别为期望步骤（E-step）和最大化步骤（M-step）增加了修剪梯度的步骤和硬阈值步骤。我们表明，在一些温和的假设和适当初始化的情况下，算法是防腐败的，当损坏的样本的损失$ε$的分数$ \ tilde {o}（\ frac {1}} {\ sqrt {\ sqrt {n}}}）时，算法是防腐败的，并收敛到（接近）最佳的最佳统计速率。此外，我们将一般框架应用于三个规范模型：高斯的混合物，回归的混合物以及线性回归与缺失协变量的混合物。我们的理论得到了彻底的数值结果的支持。

In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space ({\em i.e.,} $d\gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradients step and a hard thresholding step to the Expectation step (E-step) and the Maximization step (M-step), respectively. We show that under some mild assumptions and with an appropriate initialization, the algorithm is corruption-proofing and converges to the (near) optimal statistical rate geometrically when the fraction of the corrupted samples $ε$ is bounded by $ \tilde{O}(\frac{1}{\sqrt{n}})$. Moreover, we apply our general framework to three canonical models: mixture of Gaussians, mixture of regressions and linear regression with missing covariates. Our theory is supported by thorough numerical results.

下载PDF全文

下载文献需遵守相关版权规定

论文标题