论文标题

基于锚的光谱群集的一致性

Consistency of Anchor-based Spectral Clustering

论文作者

de Kergorlay, Henry-Louis, Higham, Desmond John

论文摘要

基于锚的技术降低了光谱聚类算法的计算复杂性。尽管经验测试显示出令人鼓舞的结果,但目前缺乏对锚定方法的理论支持。我们定义了一种基于锚固的算法,并表明它可以进行严格的分析,并且在实践中有效。我们在渐近环境中建立了该方法的理论一致性,在渐近环境中,从潜在的连续概率分布中对数据进行采样。特别是,我们为算法参数提供尖锐的渐近条件,这些条件可确保基于锚的方法可以通过高概率分离簇恢复,这些群集被正距离相互隔开。我们说明了算法在合成数据上的性能,并说明了如何使用理论收敛分析来告知参数量表的实际选择。我们还测试了两个大型真实数据集上算法的准确性和效率。我们发现该算法比标准光谱聚类具有明显的优势。我们还发现,它与最新的Chen和CAI的LSC方法具有竞争力(第25届AAAI人工智能会议,2011年),同时具有一致性保证的额外好处。

Anchor-based techniques reduce the computational complexity of spectral clustering algorithms. Although empirical tests have shown promising results, there is currently a lack of theoretical support for the anchoring approach. We define a specific anchor-based algorithm and show that it is amenable to rigorous analysis, as well as being effective in practice. We establish the theoretical consistency of the method in an asymptotic setting where data is sampled from an underlying continuous probability distribution. In particular, we provide sharp asymptotic conditions for the algorithm parameters which ensure that the anchor-based method can recover with high probability disjoint clusters that are mutually separated by a positive distance. We illustrate the performance of the algorithm on synthetic data and explain how the theoretical convergence analysis can be used to inform the practical choice of parameter scalings. We also test the accuracy and efficiency of the algorithm on two large scale real data sets. We find that the algorithm offers clear advantages over standard spectral clustering. We also find that it is competitive with the state-of-the-art LSC method of Chen and Cai (Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011), while having the added benefit of a consistency guarantee.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源