Paper title
Finding the Homology of Decision Boundaries with Active Learning
Paper authors
Paper abstract
Accurately and efficiently characterizing the decision boundary of classifiers is important for problems related to model selection and meta-learning. Inspired by topological data analysis, the characterization of decision boundaries using their homology has recently emerged as a general and powerful tool. In this paper, we propose an active learning algorithm to recover the homology of decision boundaries. Our algorithm sequentially and adaptively selects the samples whose labels it requires. We theoretically analyze the proposed framework and show that the query complexity of our active learning algorithm depends naturally on the intrinsic complexity of the underlying manifold. We demonstrate the effectiveness of our framework in selecting the best-performing machine learning models for datasets using only their respective homological summaries. Experiments on several standard datasets show an improvement in the sample complexity of recovering the homology and demonstrate the practical utility of the framework for model selection. Source code for our algorithms and experimental results is available at https://github.com/wayne0908/Active-Learning-Homology.