Paper Title

Robust PCA via Regularized REAPER with a Matrix-Free Proximal Algorithm

Authors

Robert Beinert, Gabriele Steidl

Abstract

Principal component analysis (PCA) is known to be sensitive to outliers, and various robust PCA variants have therefore been proposed in the literature. A recent model, called REAPER, aims to find the principal components by solving a convex optimization problem. Usually, the number of principal components must be determined in advance, and the minimization is performed over symmetric positive semidefinite matrices whose size equals the dimension of the data, although the number of principal components is substantially smaller. This prohibits its use if the dimension of the data is large, which is often the case in image processing. In this paper, we propose a regularized version of REAPER which enforces sparsity in the number of principal components by penalizing the nuclear norm of the corresponding orthogonal projector. This has the advantage that only an upper bound on the number of principal components is required. Our second contribution is a matrix-free algorithm for finding a minimizer of the regularized REAPER, which is also suited for high-dimensional data. The algorithm couples a primal-dual minimization approach with a thick-restarted Lanczos process. As a side result, we discuss the topic of bias in robust PCA. Numerical examples demonstrate the performance of our algorithm.
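
To make the "matrix-free" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the regularized objective has the form "sum of residual norms plus alpha times the nuclear norm" of a symmetric positive semidefinite matrix P, and that P is stored only through k eigenpairs (U, lam). The function name `reaper_objective` and all parameter names are hypothetical. For a positive semidefinite matrix the nuclear norm equals the sum of its eigenvalues, so the full n-by-n matrix never has to be formed.

```python
import numpy as np

def reaper_objective(X, U, lam, alpha):
    """Evaluate an assumed regularized REAPER-style objective without forming P.

    X     : (n, N) array, one data point per column
    U     : (n, k) array with orthonormal columns (eigenvectors of P)
    lam   : (k,) array of eigenvalues of P, assumed to lie in [0, 1]
    alpha : nonnegative weight of the nuclear-norm penalty (assumed form)
    """
    # Apply P = U diag(lam) U^T to the data using only thin (n, k) factors.
    PX = U @ (lam[:, None] * (U.T @ X))
    # Data term: sum of Euclidean norms of the residuals x_i - P x_i.
    residual_norms = np.linalg.norm(X - PX, axis=0)
    # For positive semidefinite P the nuclear norm is the sum of eigenvalues.
    return residual_norms.sum() + alpha * lam.sum()
```

The cost is O(nkN) instead of the O(n^2 N) needed with an explicit n-by-n matrix, which is the kind of saving that matters when n is the number of pixels in an image.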
