Paper Title

Reliable Distributed Clustering with Redundant Data Assignment

Authors

Venkata Gandikota, Arya Mazumdar, Ankit Singh Rawat

Abstract

In this paper, we present distributed generalized clustering algorithms that can handle large-scale data across multiple machines in spite of straggling or unreliable machines. We propose a novel data assignment scheme that enables us to obtain global information about the entire data even when some machines fail to respond with the results of the assigned local computations. The assignment scheme leads to distributed algorithms with good approximation guarantees for a variety of clustering and dimensionality reduction problems.
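
The abstract does not spell out the assignment scheme itself. As a rough illustration of why redundancy helps with stragglers, the sketch below uses simple cyclic replication, which is an assumption made here for illustration and not the scheme proposed in the paper. Each data point is copied to `replication` consecutive machines, and a brute-force check confirms whether every point is still held by some responding machine when any `num_stragglers` machines fail. All function names are hypothetical, and the check covers only data coverage, not the approximation guarantees the paper refers to.

```python
from itertools import combinations


def cyclic_assignment(num_points, num_machines, replication):
    """Assign point i to machines i, i+1, ..., i+replication-1 (mod num_machines).

    Illustrative cyclic replication only; NOT the paper's assignment scheme.
    """
    if not 1 <= replication <= num_machines:
        raise ValueError("replication must be between 1 and num_machines")
    assignment = {m: [] for m in range(num_machines)}
    for i in range(num_points):
        for k in range(replication):
            assignment[(i + k) % num_machines].append(i)
    return assignment


def covered_points(assignment, responding):
    """Return the set of points held by at least one responding machine."""
    covered = set()
    for m in responding:
        covered.update(assignment[m])
    return covered


def tolerates_stragglers(num_points, num_machines, replication, num_stragglers):
    """Check by brute force that every straggler pattern leaves all points covered."""
    assignment = cyclic_assignment(num_points, num_machines, replication)
    all_points = set(range(num_points))
    machines = range(num_machines)
    for stragglers in combinations(machines, num_stragglers):
        responding = [m for m in machines if m not in stragglers]
        if covered_points(assignment, responding) != all_points:
            return False
    return True
```

Under this toy scheme, each point lives on `replication` distinct machines, so any set of up to `replication - 1` stragglers leaves at least one copy reachable. For example, `tolerates_stragglers(12, 5, 3, 2)` holds, while lowering the replication to 2 with the same two stragglers does not.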
