Paper Title

Training Binary Neural Networks using the Bayesian Learning Rule

Authors

Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan

Abstract

Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches.
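To make the abstract's idea concrete, below is a minimal NumPy sketch of the general approach it describes: instead of optimizing binary weights w in {-1, +1} directly, maintain a Bernoulli distribution over them via logits lam, draw relaxed samples through a Gumbel/concrete-style reparameterization, and update lam with a natural-parameter step in the spirit of the Bayesian learning rule. This is an illustrative toy, not the paper's exact algorithm; the linear model, logistic loss, noise scheme, scaling term, and hyperparameters (tau, alpha, epoch count) are all assumptions made for the example.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy data: a linearly separable binary classification problem.
n, d = 200, 16
X = rng.normal(size=(n, d))
w_true = np.sign(rng.normal(size=d))
y = np.sign(X @ w_true)

lam = 0.1 * rng.normal(size=d)  # logits (natural parameter) of the Bernoulli over w in {-1, +1}
tau = 1.0                       # relaxation temperature (assumed value)
alpha = 0.1                     # learning rate (assumed value)

for _ in range(100):
    # Reparameterized "soft" sample of the binary weights (Gumbel/concrete-style relaxation).
    eps = rng.uniform(1e-10, 1.0 - 1e-10, size=d)
    delta = 0.5 * np.log(eps / (1.0 - eps))  # logistic noise
    w_soft = np.tanh((lam + delta) / tau)    # relaxed sample in (-1, 1)

    # Gradient of a logistic loss w.r.t. the relaxed weights.
    margins = np.clip(y * (X @ w_soft), -30.0, 30.0)
    grad_w = -(X * (y * sigmoid(-margins))[:, None]).sum(axis=0)

    # Bayesian-learning-rule-style step on the natural parameter: the gradient is
    # rescaled by the relaxation's Jacobian, and the old lam decays toward the update.
    scale = (1.0 - w_soft**2) / (tau * (1.0 - np.tanh(lam)**2) + 1e-10)
    lam = (1.0 - alpha) * lam - alpha * scale * grad_w

# Deploy the most probable binary weights; lam also encodes uncertainty about them,
# which is what the abstract exploits for continual learning.
w_map = np.sign(lam)
print("training accuracy with binary weights:", (np.sign(X @ w_map) == y).mean())

Note how the relaxed sampling plays the role the Straight-Through Estimator plays in earlier methods: gradients flow through a smooth surrogate of the discrete weights, which is the kind of algorithmic choice the paper derives and justifies from the Bayesian learning rule.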
