An Optimization and Generalization Analysis for Max-Pooling Networks

Paper Title

An Optimization and Generalization Analysis for Max-Pooling Networks

Authors

Alon Brutzkus, Amir Globerson

Abstract

Max-pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, or what effect over-parameterization has on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized and can generalize well even for highly over-parameterized models. Our analysis focuses on a data-generating distribution inspired by pattern detection problems, where a "discriminative" pattern needs to be detected among "spurious" patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.
