Paper title
Detecting conflicting summary statistics in likelihood-free inference
Paper authors
Abstract
Bayesian likelihood-free methods implement Bayesian inference using simulation of data from the model to substitute for intractable likelihood evaluations. Most likelihood-free inference methods replace the full data set with a summary statistic before performing Bayesian inference, and the choice of this statistic is often difficult. The summary statistic should be low-dimensional for computational reasons, while retaining as much information as possible about the parameter. Using a recent idea from the interpretable machine learning literature, we develop some regression-based diagnostic methods which are useful for detecting when different parts of a summary statistic vector contain conflicting information about the model parameters. Conflicts of this kind complicate summary statistic choice, and detecting them can be insightful about model deficiencies and guide model improvement. The diagnostic methods developed are based on regression approaches to likelihood-free inference, in which the regression model estimates the posterior density using summary statistics as features. Deletion and imputation of part of the summary statistic vector within the regression model can remove conflicts and approximate posterior distributions for summary statistic subsets. A larger than expected change in the estimated posterior density following deletion and imputation can indicate a conflict in which inferences of interest are affected. The usefulness of the new methods is demonstrated in a number of real examples.
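The deletion-and-imputation idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical toy model (one parameter, two noisy summaries), uses ordinary least squares as a stand-in for the regression model, and compares posterior mean estimates rather than full posterior densities. All function names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, rng):
    """Hypothetical toy model: theta ~ N(0, 1); each summary is theta plus noise."""
    theta = rng.normal(0.0, 1.0, size=n)
    s = theta[:, None] + rng.normal(0.0, 0.5, size=(n, 2))
    return theta, s

def fit_linear(x, y):
    """Least-squares fit of y on the columns of x, with an intercept."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, x):
    """Evaluate a fitted linear model at the rows of x."""
    x = np.atleast_2d(x)
    return np.column_stack([np.ones(len(x)), x]) @ coef

# Training simulations from the prior predictive distribution.
theta_train, s_train = simulate(20000, rng)
post_coef = fit_linear(s_train, theta_train)           # approximates E[theta | s1, s2]
imp_coef = fit_linear(s_train[:, :1], s_train[:, 1])   # approximates E[s2 | s1]

def change_after_imputation(s_vals):
    """Absolute change in the posterior mean estimate when s2 is deleted
    and replaced by its model-based imputation given s1."""
    s_vals = np.atleast_2d(s_vals)
    full = predict(post_coef, s_vals)
    s2_imputed = predict(imp_coef, s_vals[:, :1])
    imputed = predict(post_coef, np.column_stack([s_vals[:, 0], s2_imputed]))
    return np.abs(full - imputed)

# Reference distribution: the change expected when the model is correct.
_, s_ref = simulate(5000, rng)
ref_changes = change_after_imputation(s_ref)

def tail_probability(s_obs):
    """Fraction of reference changes at least as large as the observed one."""
    return float(np.mean(ref_changes >= change_after_imputation(s_obs)[0]))

tail_prob_conflict = tail_probability(np.array([0.0, 3.0]))    # summaries disagree
tail_prob_consistent = tail_probability(np.array([1.0, 1.0]))  # summaries agree
print(tail_prob_conflict, tail_prob_consistent)
```

A small tail probability indicates that the change following deletion and imputation is larger than expected under the model, which is the signal of conflict the abstract refers to. Because the stand-in regression is linear, imputing the conditional mean gives the same result as averaging over imputation draws; a flexible density regression would need the averaging done explicitly.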