Paper Title
Label-Based Diversity Measure Among Hidden Units of Deep Neural Networks: A Regularization Method
Paper Authors
Paper Abstract
Although the deep structure guarantees the powerful expressivity of deep neural networks (DNNs), it also triggers a serious overfitting problem. To improve the generalization capacity of DNNs, many strategies have been developed to promote diversity among hidden units. However, most of these strategies are empirical and heuristic, lacking either a theoretically grounded diversity measure or a clear connection between diversity and generalization capacity. In this paper, from an information-theoretic perspective, we introduce a new definition of redundancy that describes the diversity of hidden units under supervised learning settings by formalizing the effect of hidden layers on the generalization capacity as mutual information. We prove an inverse relationship between the defined redundancy and the generalization capacity, i.e., a decrease in redundancy generally improves the generalization capacity. Experiments show that DNNs using this redundancy as a regularizer effectively reduce overfitting and decrease the generalization error, which supports the above claims.
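To make the regularization idea concrete, below is a minimal sketch, not the paper's exact label-based definition. As a stand-in for the mutual-information-based redundancy, it penalizes pairwise dependence among hidden units using the Gaussian closed form I(X;Y) = -1/2 log(1 - rho^2), where rho is the correlation between two units over a batch; the function name `redundancy_penalty`, the penalty weight, and the network sizes are all illustrative assumptions.

```python
# Sketch of a redundancy-style regularizer (a Gaussian pairwise-MI proxy,
# not the paper's exact label-based formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

def redundancy_penalty(h: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """h: (batch, units) hidden activations.
    Returns the mean pairwise Gaussian-MI proxy over distinct unit pairs."""
    h = h - h.mean(dim=0, keepdim=True)              # center each unit
    h = h / (h.std(dim=0, keepdim=True) + eps)       # unit variance per unit
    n = h.size(0)
    corr = (h.t() @ h) / (n - 1)                     # (units, units) correlations
    rho2 = (corr ** 2).clamp(max=1 - eps)            # keep log argument positive
    mi = -0.5 * torch.log(1 - rho2)                  # Gaussian MI per unit pair
    off_diag = mi - torch.diag(torch.diag(mi))       # drop self-pairs
    k = h.size(1)
    return off_diag.sum() / (k * (k - 1))

class MLP(nn.Module):
    def __init__(self, d_in=784, d_hidden=256, d_out=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        h = torch.relu(self.fc1(x))                  # hidden units to diversify
        return self.fc2(h), h

# Usage: add the penalty to the task loss with a small weight (assumed 1e-3).
model = MLP()
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
logits, h = model(x)
loss = F.cross_entropy(logits, y) + 1e-3 * redundancy_penalty(h)
loss.backward()
```

Lowering this penalty pushes hidden units toward decorrelated, less redundant representations, which is the mechanism the abstract links to improved generalization.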