Paper Title
Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain
Paper Authors
Paper Abstract
The backpropagation of error algorithm (backprop) has been instrumental in the recent success of deep learning. However, a key question remains as to whether backprop can be formulated in a manner suitable for implementation in neural circuitry. The primary challenge is to ensure that any candidate formulation uses only local information, rather than relying on global signals as in standard backprop. Recently, several algorithms for approximating backprop using only local signals have been proposed. However, these algorithms typically impose other requirements which challenge biological plausibility: for example, requiring complex and precise connectivity schemes, or multiple sequential backwards phases with information being stored across phases. Here, we propose a novel algorithm, Activation Relaxation (AR), which is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system. Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, utilises only a single parallel backwards relaxation phase, and can operate on arbitrary computation graphs. We illustrate these properties by training deep neural networks on visual classification tasks, and describe simplifications to the algorithm which remove further obstacles to neurobiological implementation (for example, the weight-transport problem, and the use of nonlinear derivatives), while preserving performance.
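To make the dynamical-systems framing concrete, below is a minimal NumPy sketch of the kind of backwards relaxation phase the abstract describes, applied to a small multilayer perceptron. The network sizes, step size, iteration count, and variable names are illustrative assumptions, not the authors' reference implementation: each layer holds an auxiliary variable that is iteratively driven towards a locally computed fixed point, and at equilibrium that variable coincides with the backpropagated gradient of the loss with respect to the layer's activations.

```python
import numpy as np

# Illustrative sketch (not the authors' reference code) of the Activation
# Relaxation idea: after a standard forward pass, each layer's auxiliary
# variable e[l] is relaxed by a local dynamical update whose equilibrium
# is the backprop gradient dL/dx[l].

rng = np.random.default_rng(0)

def f(z):        # nonlinearity (tanh, an example choice)
    return np.tanh(z)

def f_prime(z):  # its derivative, used in the local update
    return 1.0 - np.tanh(z) ** 2

# Small example network: 4 -> 8 -> 8 -> 3, with squared-error loss.
sizes = [4, 8, 8, 3]
W = [rng.normal(scale=0.5, size=(sizes[l + 1], sizes[l])) for l in range(3)]

x_in = rng.normal(size=sizes[0])
target = rng.normal(size=sizes[-1])

# Forward pass: x[l+1] = f(W[l] @ x[l]); keep the pre-activations z[l].
x, z = [x_in], []
for Wl in W:
    z.append(Wl @ x[-1])
    x.append(f(z[-1]))

# Relaxation phase: e[L] is clamped to dL/dx[L]; each hidden e[l] follows
#   de[l]/dt = -e[l] + W[l].T @ (f'(z[l]) * e[l+1]),
# whose fixed point is exactly the backprop recursion for dL/dx[l].
L = len(W)
e = [np.zeros(s) for s in sizes]
e[L] = x[L] - target  # gradient of 0.5 * ||x[L] - target||^2

step_size, relax_steps = 0.2, 200
for _ in range(relax_steps):
    for l in range(L - 1, 0, -1):  # updates are local and can run in parallel
        fixed_point = W[l].T @ (f_prime(z[l]) * e[l + 1])
        e[l] += step_size * (-e[l] + fixed_point)

# Check against explicit backprop: the relaxed e[l] should match dL/dx[l].
g = x[L] - target
for l in range(L - 1, 0, -1):
    g = W[l].T @ (f_prime(z[l]) * g)
    print(l, np.max(np.abs(e[l] - g)))  # ~0 after relaxation
```

The simplifications mentioned at the end of the abstract would alter only this local update: replacing the transposed forward weights `W[l].T` with a separate set of backwards weights sidesteps the weight-transport problem (so the dynamics approximate, rather than exactly recover, the backprop gradient), and dropping the `f_prime` term removes the need for units to compute nonlinear derivatives.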