Paper Title

Uncertainty Quantification and Deep Ensembles

Paper Authors

Rahul Rahaman, Alexandre H. Thiery

Paper Abstract


Deep Learning methods are known to suffer from calibration issues: they typically produce over-confident estimates. These problems are exacerbated in the low data regime. Although the calibration of probabilistic models is well studied, calibrating extremely over-parametrized models in the low-data regime presents unique challenges. We show that deep-ensembles do not necessarily lead to improved calibration properties. In fact, we show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models. This text examines the interplay between three of the most simple and commonly used approaches to leverage deep learning when data is scarce: data-augmentation, ensembling, and post-processing calibration methods. Although standard ensembling techniques certainly help boost accuracy, we demonstrate that the calibration of deep ensembles relies on subtle trade-offs. We also find that calibration methods such as temperature scaling need to be slightly tweaked when used with deep-ensembles and, crucially, need to be executed after the averaging process. Our simulations indicate that this simple strategy can halve the Expected Calibration Error (ECE) on a range of benchmark classification problems compared to standard deep-ensembles in the low data regime.
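The abstract's key recommendation is that temperature scaling should be executed after the ensemble averaging step: first pool the members' predicted probabilities, then fit a single temperature on held-out data. Below is a minimal NumPy sketch of that "pool then calibrate" ordering together with a standard binned Expected Calibration Error (ECE) estimator. The function names, the grid-search fitting (rather than gradient-based optimization), and the bin count are illustrative choices, not details from the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def pool_then_calibrate(member_logits, val_labels,
                        grid=np.linspace(0.25, 8.0, 128)):
    """Average member predictions first ("pool"), then fit one
    temperature on held-out labels by minimizing the negative
    log-likelihood ("calibrate")."""
    # Pool: average the members' probability vectors, not their logits.
    pooled = np.mean([softmax(l) for l in member_logits], axis=0)
    log_pooled = np.log(pooled + 1e-12)
    best_t, best_nll = 1.0, np.inf
    for t in grid:
        p = softmax(log_pooled / t)
        nll = -np.mean(np.log(p[np.arange(len(val_labels)), val_labels] + 1e-12))
        if nll < best_nll:
            best_t, best_nll = t, nll
    return best_t

def expected_calibration_error(probs, labels, n_bins=15):
    """Binned ECE: |accuracy - confidence| per bin, weighted by bin mass."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs((pred[mask] == labels[mask]).mean()
                                     - conf[mask].mean())
    return ece
```

On over-confident ensemble outputs (confidence far above accuracy), the fitted temperature comes out above 1, softening the pooled probabilities and reducing the ECE on held-out data.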
