Paper Title

Gradients without Backpropagation

Paper Authors

Baydin, Atılım Güneş, Pearlmutter, Barak A., Syme, Don, Wood, Frank, Torr, Philip

Paper Abstract

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.
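The estimator the abstract describes works as follows: sample a random direction v (e.g. with i.i.d. standard normal entries), use one forward-mode pass to compute the directional derivative ∇f(θ)·v exactly, and take (∇f(θ)·v)·v as the forward gradient, which is unbiased because E[vvᵀ] = I. Below is a minimal sketch of this idea in JAX; the toy quadratic objective and fixed learning rate are illustrative assumptions, not the authors' reference implementation.

```python
import jax
import jax.numpy as jnp

def f(theta):
    # Placeholder objective: a simple quadratic standing in for a real loss.
    return jnp.sum(theta ** 2)

def forward_gradient(f, theta, key):
    # Sample a random perturbation direction v ~ N(0, I).
    v = jax.random.normal(key, theta.shape)
    # A single forward-mode pass returns f(theta) and the exact directional
    # derivative (grad f(theta) . v) -- no backpropagation involved.
    loss, dir_deriv = jax.jvp(f, (theta,), (v,))
    # The forward gradient (grad f . v) v is an unbiased estimate of grad f.
    return loss, dir_deriv * v

# Forward gradient descent: plain gradient descent with the estimator
# substituted for the true gradient.
theta = jnp.ones(5)
key = jax.random.PRNGKey(0)
for step in range(100):
    key, subkey = jax.random.split(key)
    loss, g = forward_gradient(f, theta, subkey)
    theta = theta - 0.1 * g
```

Because jax.jvp evaluates the function and its Jacobian-vector product together in one forward pass, no reverse-mode tape is recorded, which is the source of the computational savings the abstract reports.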
