Paper Title

MABSplit: Faster Forest Training Using Multi-Armed Bandits

Paper Authors

Mo Tiwari, Ryan Kang, Je-Yong Lee, Sebastian Thrun, Chris Piech, Ilan Shomorony, Martin Jinye Zhang

Paper Abstract

Random forests are some of the most widely used machine learning models today, especially in domains that necessitate interpretability. We present an algorithm that accelerates the training of random forests and other popular tree-based learning methods. At the core of our algorithm is a novel node-splitting subroutine, dubbed MABSplit, used to efficiently find split points when constructing decision trees. Our algorithm borrows techniques from the multi-armed bandit literature to judiciously determine how to allocate samples and computational power across candidate split points. We provide theoretical guarantees that MABSplit improves the sample complexity of each node split from linear to logarithmic in the number of data points. In some settings, MABSplit leads to 100x faster training (a 99% reduction in training time) without any decrease in generalization performance. We demonstrate similar speedups when MABSplit is used across a variety of forest-based variants, such as Extremely Random Forests and Random Patches. We also show our algorithm can be used in both classification and regression tasks. Finally, we show that MABSplit outperforms existing methods in generalization performance and feature importance calculations under a fixed computational budget. All of our experimental results are reproducible via a one-line script at https://github.com/ThrunGroup/FastForest.
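For intuition about the bandit-based allocation described in the abstract, the sketch below treats each candidate split threshold as an "arm", estimates its impurity cost on random mini-batches, and eliminates arms via confidence bounds so that clearly suboptimal thresholds stop consuming samples. This is a minimal illustration and not the authors' implementation (see the linked FastForest repository for that); the threshold grid, batch size, and Hoeffding-style confidence radius are illustrative assumptions.

```python
"""Illustrative bandit-style split selection, in the spirit of MABSplit (not the official code)."""
import numpy as np


def gini_impurity(labels):
    """Gini impurity of a label array (binary or multiclass)."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - np.sum(p ** 2)


def split_cost(x, y, threshold):
    """Weighted child impurity when splitting feature values x at `threshold`."""
    left, right = y[x <= threshold], y[x > threshold]
    n = y.size
    return (left.size / n) * gini_impurity(left) + (right.size / n) * gini_impurity(right)


def bandit_split(x, y, thresholds, batch_size=64, delta=0.01, rng=None):
    """
    Successive elimination over candidate thresholds ("arms").

    Each round, every surviving arm's split cost is re-estimated on a fresh
    mini-batch; arms whose lower confidence bound exceeds the best arm's upper
    confidence bound are dropped. Sampling stops when one arm remains or the
    per-arm sample budget reaches the dataset size.
    """
    rng = np.random.default_rng(rng)
    n = y.size
    arms = list(range(len(thresholds)))
    estimates = np.zeros(len(thresholds))          # running mean cost per arm
    pulls = np.zeros(len(thresholds), dtype=int)   # mini-batches drawn per arm

    while len(arms) > 1 and pulls[arms[0]] * batch_size < n:
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        xb, yb = x[idx], y[idx]
        for a in arms:
            cost = split_cost(xb, yb, thresholds[a])
            pulls[a] += 1
            estimates[a] += (cost - estimates[a]) / pulls[a]

        # Hoeffding-style confidence radius (illustrative constant).
        radius = np.sqrt(np.log(2.0 / delta) / (2.0 * pulls[arms[0]] * batch_size))
        best_ucb = min(estimates[a] for a in arms) + radius
        arms = [a for a in arms if estimates[a] - radius <= best_ucb]

    best = min(arms, key=lambda a: estimates[a])
    return thresholds[best], estimates[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=100_000)
    y = (x > 0.3).astype(int)  # true split point near 0.3
    candidates = np.quantile(x, np.linspace(0.05, 0.95, 19))
    thr, cost = bandit_split(x, y, candidates, rng=1)
    print(f"chosen threshold ~ {thr:.3f}, estimated cost {cost:.4f}")
```

In this toy setup the uninformative thresholds are eliminated after a few mini-batches, so only a small fraction of the data is touched per node, which is the source of the linear-to-logarithmic sample-complexity improvement the abstract claims.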
