Paper Title
Fast and Accurate Pseudoinverse with Sparse Matrix Reordering and Incremental Approach
Paper Authors
Paper Abstract
How can we compute the pseudoinverse of a sparse feature matrix efficiently and accurately for solving optimization problems? The pseudoinverse is a generalization of the matrix inverse and has been extensively utilized as a fundamental building block for solving linear systems in machine learning. However, even an approximate computation of the pseudoinverse, let alone an exact one, is very time-consuming due to its demanding time complexity, which prevents it from being applied to large data. In this paper, we propose FastPI (Fast PseudoInverse), a novel incremental singular value decomposition (SVD) based pseudoinverse method for sparse matrices. Based on the observation that many real-world feature matrices are sparse and highly skewed, FastPI reorders and divides the feature matrix and incrementally computes a low-rank SVD from the divided components. To show the efficacy of the proposed FastPI, we apply it to real-world multi-label linear regression problems. Through extensive experiments, we demonstrate that FastPI computes the pseudoinverse faster than other approximate methods without loss of accuracy, and uses much less memory than the full-rank SVD based approach. The results imply that our method efficiently computes the low-rank pseudoinverse of a large and sparse matrix that other existing methods cannot handle within limited time and space.
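
The following is a minimal sketch, not the FastPI algorithm itself, illustrating the two ingredients the abstract names: reordering a sparse, skewed feature matrix and using a low-rank SVD to form a pseudoinverse for multi-label linear regression. The degree-based reordering and SciPy's one-shot truncated svds are stand-in assumptions; FastPI's actual reordering strategy and its incremental SVD over the divided components are specified in the paper.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import svds

    rng = np.random.default_rng(0)
    # Toy sparse feature matrix X and multi-label targets Y (hypothetical data).
    X = sp.random(1000, 300, density=0.01, format="csr", random_state=0)
    Y = rng.standard_normal((1000, 5))

    # Degree-based reordering: move dense rows/columns to the front. This is a
    # simple stand-in for FastPI's reordering of the skewed matrix.
    row_order = np.argsort(-X.getnnz(axis=1))
    col_order = np.argsort(-X.getnnz(axis=0))
    X_perm = X[row_order][:, col_order]

    # Rank-k truncated SVD; FastPI instead builds the low-rank SVD
    # incrementally from the divided components.
    k = 50
    U, s, Vt = svds(X_perm, k=k)

    # Low-rank pseudoinverse: X+ ~ V diag(1/s) U^T.
    X_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

    # Multi-label linear regression W = X+ Y in the permuted space,
    # then undo the column permutation to recover weights for the
    # original feature order.
    W_perm = X_pinv @ Y[row_order]
    W = np.empty_like(W_perm)
    W[col_order] = W_perm

Under these assumptions, the expensive step is the rank-k SVD rather than a full-rank decomposition, which is the source of the time and memory savings the abstract claims.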