Paper Title
Training Binary Neural Networks using the Bayesian Learning Rule
Paper Authors
Paper Abstract
Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches.
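The Straight-Through Estimator mentioned in the abstract can be illustrated with a minimal sketch: weights are quantized to {-1, +1} in the forward pass, while the backward pass copies the gradient from the binary weights to the real-valued latent weights (optionally masking it where the latent weight is saturated, a common variant). This is an illustrative NumPy sketch, not the paper's Bayesian algorithm; the function names and the toy squared-loss step are assumptions for demonstration.

```python
import numpy as np

def binarize(w):
    """Forward pass: quantize real-valued latent weights to {-1, +1}."""
    return np.where(w >= 0, 1.0, -1.0)

def ste_grad(w, grad_wrt_binary, clip=1.0):
    """Straight-Through Estimator backward pass: pass the gradient of the
    loss w.r.t. the binary weights straight through to the latent weights,
    zeroed where |w| exceeds the clipping threshold (a common variant)."""
    pass_through = (np.abs(w) <= clip).astype(w.dtype)
    return grad_wrt_binary * pass_through

# One SGD step on the latent weights of a toy linear model with squared
# loss on a single example (all names and values here are illustrative).
rng = np.random.default_rng(0)
w = rng.normal(size=4)            # real-valued latent weights
x = rng.normal(size=4)            # one input example
y = 1.0                           # target
wb = binarize(w)                  # binary weights used in the forward pass
pred = wb @ x
grad_wb = 2.0 * (pred - y) * x    # dL/d(wb) for squared loss
w -= 0.1 * ste_grad(w, grad_wb)   # update latent weights via the STE
```

Note that the discrete `binarize` step has zero gradient almost everywhere, so naive backpropagation would never update `w`; the STE's identity backward pass is exactly the heuristic whose surprising effectiveness the paper sets out to justify.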