Paper Title
Generalization Properties of NAS under Activation and Skip Connection Search
Paper Authors
Paper Abstract
Neural Architecture Search (NAS) has fostered the automatic discovery of state-of-the-art neural architectures. Despite the progress achieved with NAS, little attention has been paid so far to its theoretical guarantees. In this work, we study the generalization properties of NAS under a unifying framework that enables (deep) layer skip connection search and activation function search. To this end, we derive lower (and upper) bounds on the minimum eigenvalue of the Neural Tangent Kernel (NTK) in both the finite-width and infinite-width regimes, for a search space that includes mixed activation functions, fully connected networks, and residual neural networks. We use the minimum eigenvalue to establish generalization error bounds for NAS under stochastic gradient descent training. Importantly, we show theoretically and experimentally how the derived results can guide NAS to select top-performing architectures even without training, leading to a train-free algorithm based on our theory. Accordingly, our numerical validation sheds light on the design of computationally efficient NAS methods. Our analysis is non-trivial due to the coupling of various architectures and activation functions under the unifying framework, and is of independent interest for deep learning theory in providing a lower bound on the minimum eigenvalue of the NTK.
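The train-free selection rule described in the abstract can be illustrated with a short sketch: score each candidate architecture by the minimum eigenvalue of its empirical NTK on a small probe batch, without any training. This is a minimal illustration under simplifying assumptions, not the authors' algorithm; the helper names (`init_mlp`, `forward`, `ntk_min_eigenvalue`), the network, and all hyperparameters below are hypothetical.

```python
# Minimal sketch of a train-free NTK-based architecture score (JAX).
# Assumption: a scalar-output fully connected net stands in for a NAS candidate.
import jax
import jax.numpy as jnp

def init_mlp(key, widths):
    """Initialize a fully connected net; `widths` is a hypothetical layer spec."""
    params = []
    for din, dout in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        params.append(jax.random.normal(sub, (din, dout)) / jnp.sqrt(din))
    return params

def forward(params, x, act=jax.nn.relu):
    """Scalar-output network; `act` models the activation-search dimension."""
    h = x
    for w in params[:-1]:
        h = act(h @ w)
    return (h @ params[-1]).squeeze(-1)

def ntk_min_eigenvalue(params, x):
    """lambda_min of the empirical NTK K = J J^T on a probe batch x."""
    # Jacobian of the batch outputs w.r.t. all parameters, flattened per example.
    jac = jax.jacobian(forward)(params, x)
    flat = jnp.concatenate(
        [j.reshape(x.shape[0], -1) for j in jax.tree_util.tree_leaves(jac)], axis=1)
    kernel = flat @ flat.T
    return jnp.linalg.eigvalsh(kernel)[0]  # eigvalsh returns ascending eigenvalues

key = jax.random.PRNGKey(0)
key, xkey = jax.random.split(key)
x = jax.random.normal(xkey, (32, 16))      # small probe batch
params = init_mlp(key, [16, 64, 64, 1])
print(ntk_min_eigenvalue(params, x))       # larger is better under the theory
```

In a NAS loop, one would compute this score at random initialization for each candidate and keep the architectures with the largest minimum eigenvalue, which is what makes the procedure training-free.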