Paper title
Optimizing Connectivity through Network Gradients for Restricted Boltzmann Machines
Paper authors
Paper abstract
Leveraging sparse networks to connect successive layers in deep neural networks has recently been shown to benefit large-scale state-of-the-art models. However, network connectivity also plays a significant role in the learning performance of shallow networks, such as the classic Restricted Boltzmann Machine (RBM). Efficiently finding sparse connectivity patterns that improve the learning performance of shallow networks is a fundamental problem. While recent principled approaches explicitly include network connections as model parameters that must be optimized, they often rely on explicit penalization or on network sparsity as a hyperparameter. This work presents Network Connectivity Gradients (NCG), an optimization method for finding optimal connectivity patterns for RBMs. NCG leverages the idea of network gradients: given a specific connection pattern, it determines the gradient of every possible connection and uses that gradient to drive a continuous connection strength parameter, which in turn is used to determine the connection pattern. Thus, learning the RBM parameters and learning the network connections are performed truly jointly, albeit with different learning rates, and without changes to the model's classic energy-based objective function. The proposed method is applied to MNIST and other data sets, showing that better RBM models are found for the benchmark tasks of sample generation and classification. Results also show that NCG is robust to network initialization and is capable of both adding and removing network connections while learning.
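The abstract does not give the update rules, so the following is only a minimal sketch of how the idea might look in NumPy, not the authors' implementation. It assumes a binary RBM trained with one step of contrastive divergence (CD-1), a connection pattern obtained by thresholding the continuous connection strengths, and a straight-through style update in which the gradient with respect to each mask entry drives the corresponding strength. The names `ncg_step`, `S`, `lr_w`, `lr_s`, and `threshold` are hypothetical and chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def ncg_step(v0, W, b_v, b_h, S, rng, lr_w=0.05, lr_s=0.01, threshold=0.0):
    """One joint update of RBM parameters and connection strengths (sketch).

    v0  : (batch, n_visible) binary data
    W   : (n_visible, n_hidden) weights
    b_v : (n_visible,) visible biases; b_h : (n_hidden,) hidden biases
    S   : (n_visible, n_hidden) continuous connection strengths
    """
    # Assumption: a connection is active when its strength exceeds a threshold.
    mask = (S > threshold).astype(W.dtype)
    W_eff = W * mask  # only active connections enter the energy function

    # Positive phase: hidden probabilities and samples given the data.
    h0_prob = sigmoid(v0 @ W_eff + b_h)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(W.dtype)

    # Negative phase: one Gibbs step (CD-1, an assumption of this sketch).
    v1_prob = sigmoid(h0 @ W_eff.T + b_v)
    v1 = (rng.random(v1_prob.shape) < v1_prob).astype(W.dtype)
    h1_prob = sigmoid(v1 @ W_eff + b_h)

    # Data-minus-model correlations for EVERY possible connection,
    # including the ones that are currently inactive.
    corr = (v0.T @ h0_prob - v1.T @ h1_prob) / v0.shape[0]

    # With effective weight w_ij * a_ij, the log-likelihood gradient is
    # a_ij * corr_ij for the weight and w_ij * corr_ij for the mask entry.
    W_new = W + lr_w * corr * mask
    # Assumption: the mask gradient is passed straight through to S,
    # using its own (different) learning rate.
    S_new = S + lr_s * corr * W

    b_v_new = b_v + lr_w * (v0 - v1).mean(axis=0)
    b_h_new = b_h + lr_w * (h0_prob - h1_prob).mean(axis=0)
    return W_new, b_v_new, b_h_new, S_new
```

In this sketch a strength that crosses the threshold upward adds a connection and one that crosses it downward removes a connection, and no sparsity penalty or target sparsity appears in the update, which matches the behaviour the abstract describes at a high level.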