论文标题

聚类重要的是:与离群值聚类的最佳近似

Clustering What Matters: Optimal Approximation for Clustering with Outliers

论文作者

Agrawal, Akanksha, Inamdar, Tanmay, Saurabh, Saket, Xue, Jie

论文摘要

与异常值聚类是计算机科学中最根本的问题之一。鉴于$ n $ n $点的$ x $ $ n $点和两个整数$ k $和$ m $,与异常值的聚类旨在将$ m $点排除在$ x $中,并将其余点分配到$ k $ clusters中,以最大程度地减少一定的成本函数。在本文中,我们提供了一种与离群值求解聚类的一般方法,这导致$ k $和$ m $的固定参数可处理(FPT)算法几乎与其无离线的无差异对应物的近似值相匹配。作为推论,我们获得了具有$ k $ -Median和$ k $ -MEANS的FPT近似算法,具有一般指标的异常值。我们还向其他问题的其他变体中展示了更多的应用,这些变体对聚类施加了其他约束,例如公平或矩阵约束。

Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set $X$ of $n$ points and two integers $k$ and $m$, the clustering with outliers aims to exclude $m$ points from $X$ and partition the remaining points into $k$ clusters that minimizes a certain cost function. In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in $k$ and $m$, that almost matches the approximation ratio for its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for $k$-Median and $k$-Means with outliers in general metrics. We also exhibit more applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源