Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha

Paper title

Beyond Neyman-Pearson: e-values enable hypothesis testing with a data-driven alpha

Paper author

Grünwald, Peter

Paper abstract

A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning an e-value instead. With p-values, it is not clear how to use an extreme observation (e.g. $p \ll \alpha$) for getting better frequentist decisions. With e-values it is straightforward, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post-hoc, after observation of the data, thereby providing a handle on "roving $\alpha$'s". When Type-II risks are taken into consideration, the only admissible decision rules in the post-hoc setting turn out to be e-value-based. Similarly, if the loss incurred when specifying a faulty confidence interval is not fixed in advance, standard confidence intervals and distributions may fail whereas e-confidence sets and e-posteriors still provide valid risk guarantees. Sufficiently powerful e-values have by now been developed for a range of classical testing problems. We discuss the main challenges for wider development and deployment.
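
To make the central object of the abstract concrete, here is a minimal sketch of one textbook e-value. It is not the paper's own construction. It assumes a simple null hypothesis (i.i.d. standard normal data) and a likelihood ratio against a hypothetical alternative mean `mu1` chosen before seeing the data. The function names and all parameter values are illustrative assumptions. Under the null such a likelihood ratio has expectation 1, so by Markov's inequality the rule "reject if E >= 1/alpha" has Type-I error at most alpha. The paper's post-hoc guarantees for general loss functions go beyond what this sketch shows.

```python
import math
import random


def likelihood_ratio_e_value(xs, mu1):
    """E-value for H0: X_i ~ N(0, 1), as the likelihood ratio against N(mu1, 1).

    mu1 is a hypothetical alternative fixed before the data are seen.
    Under H0 the expectation of this quantity is exactly 1.
    """
    n = len(xs)
    return math.exp(mu1 * sum(xs) - n * mu1 ** 2 / 2)


def smallest_rejection_level(e_value):
    """Smallest alpha for which the rule 'reject if E >= 1/alpha' rejects."""
    if e_value <= 0.0:
        return 1.0  # no evidence against H0 at any level below 1
    return min(1.0, 1.0 / e_value)


def simulate_null(n, mu1, alpha, trials, seed=0):
    """Draw data under H0; return (mean e-value, fraction with E >= 1/alpha)."""
    rng = random.Random(seed)
    total, rejections = 0.0, 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        e = likelihood_ratio_e_value(xs, mu1)
        total += e
        rejections += e >= 1.0 / alpha
    return total / trials, rejections / trials


if __name__ == "__main__":
    # Illustrative settings only; none of these values come from the paper.
    mean_e, type1 = simulate_null(n=10, mu1=0.3, alpha=0.05, trials=20000)
    print(f"mean e-value under H0: {mean_e:.3f}")
    print(f"rejection rate under H0 at alpha=0.05: {type1:.4f}")
```

In this sketch an observed e-value of 20 corresponds to rejection at level 0.05, and a larger e-value to a smaller level. The simulation checks only the fixed-alpha Markov bound, not the post-hoc risk bounds discussed in the abstract.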
