Paper Title
Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment
Paper Authors
Paper Abstract
Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, it requires researchers to continually develop various tricks, such as specialized weight initializations and activation functions, to ensure stable parameter optimization. Our goal is to seek an effective, neurobiologically plausible alternative to backprop that can be used to train deep networks. In this paper, we propose a gradient-free learning procedure, recursive local representation alignment, for training large-scale neural architectures. Experiments with residual networks on CIFAR-10 and the larger ImageNet benchmark show that our algorithm generalizes as well as backprop while converging sooner, owing to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets.
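To make the contrast with backprop concrete, below is a minimal, illustrative sketch of a local (backprop-free) update scheme for a small multilayer perceptron, in the spirit of the local representation alignment idea named in the abstract. It is not the authors' exact recursive LRA procedure: the choice of layer targets, the fixed error-routing matrices `E`, the activation `tanh`, and the step sizes `beta` and `lr` are all assumptions made for illustration only. The point it demonstrates is that each layer's weight update depends only on that layer's own error and input, so the updates are independent of one another and could be computed in parallel, unlike the sequential chain of backprop.

```python
# Illustrative sketch only: a local, backprop-free update rule for a toy MLP.
# NOT the authors' exact rec-LRA algorithm; targets, error-routing matrices E,
# and hyperparameters (beta, lr) are assumptions for demonstration purposes.
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Hidden-layer activation (assumed tanh for this sketch)."""
    return np.tanh(x)

# Toy network: 8 -> 16 -> 16 -> 4
sizes = [8, 16, 16, 4]
W = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i])) for i in range(3)]      # forward weights
E = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i + 2])) for i in range(2)]  # fixed error-routing weights (assumption)

def local_update(x, y, lr=0.05, beta=0.1):
    # Forward pass: keep pre-activations z and post-activations h per layer.
    h, z = [x], []
    for i, Wi in enumerate(W):
        z.append(Wi @ h[-1])
        h.append(phi(z[-1]) if i < len(W) - 1 else z[-1])  # linear output layer

    # The output mismatch drives the topmost error; deeper layers get "nudged"
    # local targets by routing that error downward through the fixed matrices E.
    # No global backprop chain: each error unit uses only locally available signals.
    e = [None] * len(W)
    e[-1] = h[-1] - y
    for i in range(len(W) - 2, -1, -1):
        target = phi(z[i]) - beta * (E[i] @ e[i + 1])  # nudged local target
        e[i] = phi(z[i]) - target                      # local representation error

    # Each weight matrix is updated from its own layer's error and input only,
    # so these updates are mutually independent and parallelizable.
    for i in range(len(W)):
        W[i] -= lr * np.outer(e[i], h[i])
    return 0.5 * float(np.sum(e[-1] ** 2))

# Usage: fit a single toy input/target pair and watch the output error shrink.
x = rng.normal(size=8)
y = np.array([1.0, 0.0, 0.0, 0.0])
for step in range(200):
    loss = local_update(x, y)
print(f"final output error: {loss:.4f}")
```

In this sketch the matrices `E` play the role of a separate error pathway, so no transpose of the forward weights (and no stored computation graph) is needed; that is the property the abstract appeals to when it claims the weight updates are parallelizable and computationally less demanding than backprop.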