Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting

Paper Title

Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting

Authors

Niccolò Dalmasso, Rafael Izbicki, Ann B. Lee

Abstract

Parameter estimation, statistical tests and confidence sets are the cornerstones of classical statistics that allow scientists to make inferences about the underlying process that generated the observed data. A key question is whether one can still construct hypothesis tests and confidence sets with proper coverage and high power in a so-called likelihood-free inference (LFI) setting; that is, a setting where the likelihood is not explicitly known but one can forward-simulate observable data according to a stochastic model. In this paper, we present $\texttt{ACORE}$ (Approximate Computation via Odds Ratio Estimation), a frequentist approach to LFI that first formulates the classical likelihood ratio test (LRT) as a parametrized classification problem, and then uses the equivalence of tests and confidence sets to build confidence regions for parameters of interest. We also present a goodness-of-fit procedure for checking whether the constructed tests and confidence regions are valid. $\texttt{ACORE}$ is based on the key observation that the LRT statistic, the rejection probability of the test, and the coverage of the confidence set are conditional distribution functions which often vary smoothly as a function of the parameters of interest. Hence, instead of relying solely on samples simulated at fixed parameter settings (as is the convention in standard Monte Carlo solutions), one can leverage machine learning tools and data simulated in the neighborhood of a parameter to improve estimates of quantities of interest. We demonstrate the efficacy of $\texttt{ACORE}$ with both theoretical and empirical results. Our implementation is available on GitHub.
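
The equivalence of tests and confidence sets mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation and does not estimate odds with a classifier. It shows only the standard Monte Carlo baseline that the abstract contrasts with: for each candidate parameter on a grid, simulate at that fixed value, estimate a critical value, and keep the candidates that are not rejected. The Gaussian simulator, the distance-to-mean statistic, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical forward simulator: n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def statistic(data, theta0):
    """Stand-in test statistic (large values count against theta0).

    An odds-based approach would plug an estimated LRT statistic in here.
    """
    return abs(statistics.fmean(data) - theta0)


def critical_value(theta0, n, alpha, n_sim, rng):
    """Monte Carlo estimate of the (1 - alpha) quantile of the statistic
    when the data are simulated at the fixed parameter value theta0."""
    draws = sorted(
        statistic(simulate(theta0, n, rng), theta0) for _ in range(n_sim)
    )
    return draws[min(n_sim - 1, int((1 - alpha) * n_sim))]


def confidence_set(observed, grid, alpha=0.1, n_sim=500, seed=0):
    """Invert the test: keep every grid value that is not rejected."""
    rng = random.Random(seed)
    n = len(observed)
    return [
        theta0
        for theta0 in grid
        if statistic(observed, theta0) <= critical_value(theta0, n, alpha, n_sim, rng)
    ]


if __name__ == "__main__":
    observed = simulate(1.0, 50, random.Random(1))
    grid = [i / 10 for i in range(-10, 31)]
    print(confidence_set(observed, grid))
```

Note that this baseline reruns the simulator separately at every grid point. The abstract's observation is that quantities such as the critical value often vary smoothly in the parameter, so they can instead be estimated jointly from data simulated across neighboring parameter values.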
