Representation Topology Divergence: A Method for Comparing Neural Network Representations

Paper Title

Representation Topology Divergence: A Method for Comparing Neural Network Representations

Authors

Serguei Barannikov, Ilya Trofimov, Nikita Balabin, Evgeny Burnaev

Abstract

Comparison of data representations is a complex multi-aspect problem that has not enjoyed a complete solution yet. We propose a method for comparing two data representations. We introduce the Representation Topology Divergence (RTD), measuring the dissimilarity in multi-scale topology between two point clouds of equal size with a one-to-one correspondence between points. The data point clouds are allowed to lie in different ambient spaces. The RTD is one of the few TDA-based practical methods applicable to real machine learning datasets. Experiments show that the proposed RTD agrees with the intuitive assessment of data representation similarity and is sensitive to its topological structure. We apply RTD to gain insights into neural network representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning, disentanglement assessment.
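
The input setting described in the abstract (two point clouds of equal size, matched point by point, possibly living in spaces of different dimension) can be illustrated with a minimal sketch. This is not the RTD computation itself, which the abstract does not spell out; it only shows, under assumed array shapes and a hypothetical helper named `pairwise_distances`, why differing ambient dimensions are not an obstacle: each cloud can be summarized by an N x N pairwise distance matrix, and row i in both matrices refers to the same data sample.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix (N x N) for an (N x d) array of points.

    Hypothetical helper for illustration; not taken from the paper.
    """
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Assumed toy data: two representations of the same 50 samples.
# Row i of rep_a and row i of rep_b describe the same sample
# (the one-to-one correspondence), but the dimensions differ.
rng = np.random.default_rng(0)
n_points = 50
rep_a = rng.normal(size=(n_points, 64))  # e.g. a 64-dimensional embedding
rep_b = rng.normal(size=(n_points, 8))   # e.g. an 8-dimensional embedding

dist_a = pairwise_distances(rep_a)
dist_b = pairwise_distances(rep_b)

# Both matrices have the same shape, so they can be compared entry by
# entry even though the point clouds lie in different ambient spaces.
print(dist_a.shape, dist_b.shape)
```

Any topological comparison would then operate on these two matrices rather than on the raw coordinates; the specific construction used by RTD is defined in the paper and is not reproduced here.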
