Title
A statistical theory of cold posteriors in deep neural networks
Authors
Abstract
To get Bayesian neural networks to perform comparably to standard neural networks it is usually necessary to artificially reduce uncertainty using a "tempered" or "cold" posterior. This is extremely concerning: if the prior is accurate, Bayes inference/decision theory is optimal, and any artificial changes to the posterior should harm performance. While this suggests that the prior may be at fault, here we argue that in fact, BNNs for image classification use the wrong likelihood. In particular, standard image benchmark datasets such as CIFAR-10 are carefully curated. We develop a generative model describing curation which gives a principled Bayesian account of cold posteriors, because the likelihood under this new generative model closely matches the tempered likelihoods used in past work.
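For reference, the "tempered" or "cold" posteriors the abstract refers to are usually written as follows. This is the standard formulation from the cold-posteriors literature, not an equation taken from this paper, so the symbols ($T$, $\lambda$) are illustrative:

```latex
% Standard Bayesian posterior over network weights \theta given data D:
%   p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)
%
% A "cold" posterior sharpens the whole posterior with a temperature T < 1:
p_T(\theta \mid D) \propto \bigl( p(D \mid \theta)\, p(\theta) \bigr)^{1/T}
%
% A "tempered" posterior instead rescales only the likelihood term:
p_\lambda(\theta \mid D) \propto p(D \mid \theta)^{1/\lambda}\, p(\theta)
```

With $T = \lambda = 1$ both reduce to exact Bayesian inference; the paper's point is that values below 1, which artificially reduce uncertainty, are what is needed in practice for BNNs to match standard networks on curated benchmarks such as CIFAR-10.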