Directed Acyclic Graphs and causal thinking in clinical risk prediction modeling

Paper title

Directed Acyclic Graphs and causal thinking in clinical risk prediction modeling

Authors

Marco Piccininni, Stefan Konigorski, Jessica L. Rohmann, Tobias Kurth

Abstract

Background: In epidemiology, causal inference and prediction modeling methodologies have been historically distinct. Directed Acyclic Graphs (DAGs) are used to model a priori causal assumptions and inform variable selection strategies for causal questions. Although tools originally designed for prediction are finding applications in causal inference, the counterpart has remained largely unexplored. The aim of this theoretical and simulation-based study is to assess the potential benefit of using DAGs in clinical risk prediction modeling. Methods and Findings: We explore how incorporating knowledge about the underlying causal structure can provide insights about the transportability of diagnostic clinical risk prediction models to different settings. A single-predictor model in the causal direction is likely to have better transportability than one in the anticausal direction. We further probe whether causal knowledge can be used to improve predictor selection. We empirically show that the Markov Blanket, the set of variables including the parents, children, and parents of the children of the outcome node in a DAG, is the optimal set of predictors for that outcome. Conclusions: Our findings challenge the generally accepted notion that a change in the distribution of the predictors does not affect diagnostic clinical risk prediction model calibration if the predictors are properly included in the model. Furthermore, using DAGs to identify Markov Blanket variables may be a useful, efficient strategy to select predictors in clinical risk prediction models if strong knowledge of the underlying causal structure exists or can be learned.
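
The Markov Blanket definition in the abstract (parents, children, and parents of the children of the outcome node) can be illustrated with a minimal sketch. The DAG below, its variable names, and the helper `markov_blanket` are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch assumes the causal structure is already known.

```python
def markov_blanket(children_of, target):
    """Return the Markov Blanket of `target` in a DAG.

    `children_of` maps each node to the list of its direct children.
    The blanket is the union of the target's parents, its children,
    and the other parents of its children.
    """
    # Parents: nodes that list the target among their children.
    parents = {n for n, kids in children_of.items() if target in kids}
    # Children: nodes the target points to directly.
    children = set(children_of.get(target, ()))
    # Co-parents: nodes sharing at least one child with the target.
    co_parents = {n for n, kids in children_of.items() if children & set(kids)}
    # The target itself shows up among the co-parents, so remove it.
    return (parents | children | co_parents) - {target}


# Hypothetical DAG (illustrative only):
# Age -> Disease, Smoking -> Disease, Disease -> Symptom,
# Medication -> Symptom, Symptom -> Absence, Age -> Income
dag = {
    "Age": ["Disease", "Income"],
    "Smoking": ["Disease"],
    "Disease": ["Symptom"],
    "Medication": ["Symptom"],
    "Symptom": ["Absence"],
    "Income": [],
    "Absence": [],
}

blanket = markov_blanket(dag, "Disease")
print(sorted(blanket))
```

In this toy graph, `Absence` (a descendant reached only through `Symptom`) and `Income` (a child of `Age` only) fall outside the blanket, so under the abstract's claim they would not be selected as predictors of `Disease`.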
