Paper Title

Adversarially Robust Topological Inference

Authors

Vishwanath, Siddharth; Sriperumbudur, Bharath K.; Fukumizu, Kenji; Kuriki, Satoshi

Abstract

The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of outliers. Drawing inspiration from recent developments in robust statistics, we propose a \textit{median-of-means} variant of the distance function (\textsf{MoM Dist}) and establish its statistical properties. In particular, we show that, even in the presence of outliers, the sublevel filtrations and weighted filtrations induced by \textsf{MoM Dist} are both consistent estimators of the true underlying population counterpart and exhibit near minimax-optimal performance in adversarial settings. Finally, we demonstrate the advantages of the proposed methodology through simulations and applications.
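The abstract does not spell out how the median-of-means distance is computed. As a rough illustration only, one plausible reading is: split the sample into disjoint blocks, compute the usual distance-to-set function for each block, and take the pointwise median across blocks. The sketch below follows that reading; the names `dist_to_set`, `mom_dist`, and `n_blocks` are hypothetical, the random partition is an assumption, and the paper's exact definition may differ.

```python
import numpy as np


def dist_to_set(queries, points):
    """Euclidean distance from each query point to its nearest point in `points`."""
    diffs = queries[:, None, :] - points[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)


def mom_dist(queries, sample, n_blocks, rng=None):
    """Median over blocks of the per-block distance function.

    Assumption: blocks come from a random partition of the sample into
    `n_blocks` disjoint, non-empty, roughly equal-sized groups.
    """
    rng = np.random.default_rng(rng)
    perm = rng.permutation(len(sample))
    blocks = np.array_split(perm, n_blocks)
    per_block = np.stack([dist_to_set(queries, sample[idx]) for idx in blocks])
    return np.median(per_block, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, 2.0 * np.pi, 200)
    circle = np.column_stack([np.cos(theta), np.sin(theta)])  # clean signal
    outliers = np.zeros((5, 2))                               # planted at the center
    sample = np.vstack([circle, outliers])
    center = np.zeros((1, 2))

    # The plain distance function is pulled to zero by the outliers, while the
    # block-median version should stay near the circle's radius.
    print("plain distance at center:", dist_to_set(center, sample)[0])
    print("MoM distance at center:  ", mom_dist(center, sample, n_blocks=11, rng=1)[0])
```

In this toy setup the number of blocks (11) exceeds twice the number of outliers (5), so a majority of blocks contain no outlier and the median is determined by clean blocks. That is the intuition behind the robustness claim, not a statement of the paper's formal conditions.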
