URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks

Paper Title

URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks

Authors

Vadera, Meet P., Cobb, Adam D., Jalaian, Brian, Marlin, Benjamin M.

Abstract

While deep learning methods continue to improve in predictive accuracy on a wide range of application domains, significant issues remain with other aspects of their performance, including their ability to quantify uncertainty and their robustness. Recent advances in approximate Bayesian inference hold significant promise for addressing these concerns, but the computational scalability of these methods can be problematic when applied to large-scale models. In this paper, we describe initial work on the development of URSABench (the Uncertainty, Robustness, Scalability, and Accuracy Benchmark), an open-source suite of benchmarking tools for comprehensive assessment of approximate Bayesian inference methods with a focus on deep learning-based classification tasks.
