Paper title
Learning Active Learning at the Crossroads? Evaluation and Discussion
Paper authors
Paper abstract
Active learning aims to reduce annotation cost by predicting which samples are most useful for a human expert to label. Although the field is quite old, several important challenges to using active learning in real-world settings remain unsolved. In particular, most selection strategies are hand-designed, and it has become clear that no single active learning strategy consistently outperforms all others across all applications. This has motivated research into meta-learning algorithms for "learning how to actively learn". In this paper, we compare this kind of approach with the combination of a Random Forest and the margin sampling strategy, which recent comparative studies report as a very competitive heuristic. To this end, we present the results of a benchmark performed on 20 datasets that compares a strategy learned with a recent meta-learning algorithm against margin sampling. We also present some lessons learned and open future perspectives.
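
For readers unfamiliar with the margin sampling heuristic named in the abstract, the following is a minimal sketch of the general technique, not the paper's implementation. It assumes a classifier (for example, a Random Forest) has already produced class-probability estimates for each unlabeled sample; the function names `margin_scores` and `select_by_margin` and the example probabilities are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def margin_scores(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Return, for each sample, the gap between its two highest class probabilities."""
    scores = []
    for row in probabilities:
        if len(row) < 2:
            raise ValueError("margin sampling needs at least two classes")
        top, second = sorted(row, reverse=True)[:2]
        scores.append(top - second)
    return scores


def select_by_margin(probabilities: Sequence[Sequence[float]],
                     batch_size: int = 1) -> List[int]:
    """Return indices of the `batch_size` samples with the smallest margin.

    A small margin means the model hesitates between its two best guesses,
    so the sample is treated as the most informative one to label next.
    """
    scores = margin_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:batch_size]


# Hypothetical class-probability estimates for four unlabeled samples,
# standing in for the output of a trained probabilistic classifier.
pool_probabilities = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # uncertain prediction
    [0.50, 0.48, 0.02],  # near tie between the top two classes
    [0.70, 0.20, 0.10],
]

queried = select_by_margin(pool_probabilities, batch_size=2)
print(queried)
```

In an active learning loop, the queried samples would be sent to the human expert, added to the labeled set, and the classifier retrained before the next selection round.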