Paper Title

Succinct Explanations With Cascading Decision Trees

Paper Authors

Jialu Zhang, Yitan Wang, Mark Santolucito, Ruzica Piskac

Paper Abstract

The decision tree is one of the most popular and classic machine learning models, dating back to the 1980s. However, in many practical applications, decision trees tend to generate decision paths with excessive depth. Long decision paths often cause overfitting and make models difficult to interpret. With longer decision paths, inference is also more likely to fail when the data contain missing values. In this work, we propose a new tree model called Cascading Decision Trees to alleviate this problem. The key insight of Cascading Decision Trees is to separate the decision path from the explanation path. Our experiments show that, on average, Cascading Decision Trees generate 63.38% shorter explanation paths, avoiding overfitting and thus achieving higher test accuracy. We also empirically demonstrate that Cascading Decision Trees are more robust to missing values.
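
To make the idea of separating a decision path from an explanation path concrete, here is a minimal sketch in Python. It is not the authors' implementation: the Node structure, the confident_label field, and the early-stop rule are illustrative assumptions. The sketch shows how an explanation can end at the first internal node whose reached subset is already (near-)pure, even though a conventional decision tree would keep descending to a leaf.

```python
# Hypothetical sketch (not the paper's code): separating the decision path
# from a shorter explanation path in a binary decision tree.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[int] = None          # index of the feature to split on
    threshold: Optional[float] = None      # split threshold
    left: Optional["Node"] = None          # subtree for feature <= threshold
    right: Optional["Node"] = None         # subtree for feature > threshold
    label: Optional[int] = None            # prediction at a leaf
    confident_label: Optional[int] = None  # assumed: early prediction when the
                                           # subset reaching this node is pure

def predict_with_explanation(node: Node, x):
    """Return (prediction, explanation_path) for a single sample x."""
    explanation = []
    while node.label is None:
        went_left = x[node.feature] <= node.threshold
        explanation.append((node.feature, "<=" if went_left else ">", node.threshold))
        node = node.left if went_left else node.right
        # The conditions collected so far already determine the prediction,
        # so report them as the succinct explanation and stop descending.
        if node.confident_label is not None:
            return node.confident_label, explanation
    return node.label, explanation

# Toy usage: the single condition x[0] <= 2.0 is enough to explain the
# prediction 1, even though a deeper subtree exists under the left branch.
tree = Node(feature=0, threshold=2.0,
            left=Node(feature=1, threshold=5.0,
                      left=Node(label=1), right=Node(label=1),
                      confident_label=1),
            right=Node(label=0))
print(predict_with_explanation(tree, [1.5, 7.0]))  # (1, [(0, '<=', 2.0)])
```

Under these assumptions, a shorter explanation path also helps when features are missing: a prediction only requires the few feature values that appear in the explanation, not every feature along a deep root-to-leaf path.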
