Paper Title
Inter-layer Transition in Neural Architecture Search
Paper Authors
Paper Abstract
Differentiable Neural Architecture Search (NAS) methods represent the network architecture as a proxy directed acyclic graph (DAG) of repeated cells and alternately optimize the network weights and the architecture weights in a differentiable manner. However, existing methods model the architecture weights on each edge (i.e., a layer in the network) as statistically independent variables, ignoring the dependency between edges in the DAG induced by their directed topological connections. In this paper, we make the first attempt to investigate such dependency by proposing a novel Inter-layer Transition NAS method. It casts the architecture optimization into a sequential decision process in which the dependency between the architecture weights of connected edges is explicitly modeled. Specifically, edges are divided into inner and outer groups according to whether or not their predecessor edges are in the same cell. While the architecture weights of outer edges are optimized independently, those of inner edges are derived sequentially from the architecture weights of their predecessor edges and learnable transition matrices in an attentive probability transition manner. Experiments on five benchmarks confirm the value of modeling inter-layer dependency and demonstrate that the proposed method outperforms state-of-the-art methods.
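To make the transition mechanism concrete, below is a minimal, hypothetical PyTorch sketch of how an inner edge's architecture weights could be derived from its predecessor edges via learnable transition matrices and an attentive combination. All names (InterLayerTransition, num_ops, num_predecessors) and the specific normalization and attention forms are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only; the paper's actual attentive probability transition may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterLayerTransition(nn.Module):
    """Derives the architecture weights of an inner edge from those of its
    predecessor edges via learnable transition matrices (assumed form)."""
    def __init__(self, num_ops: int, num_predecessors: int):
        super().__init__()
        # One learnable transition matrix per predecessor edge, initialized to identity.
        self.transitions = nn.Parameter(torch.eye(num_ops).repeat(num_predecessors, 1, 1))
        # Attention logits over predecessor edges.
        self.attn_logits = nn.Parameter(torch.zeros(num_predecessors))

    def forward(self, predecessor_alphas: torch.Tensor) -> torch.Tensor:
        # predecessor_alphas: (num_predecessors, num_ops) raw architecture weights.
        probs = F.softmax(predecessor_alphas, dim=-1)        # per-edge op distributions
        # Propagate each predecessor distribution through its transition matrix,
        # then renormalize so each row remains a probability distribution.
        transited = torch.einsum('pij,pi->pj', self.transitions, probs)
        transited = F.softmax(transited, dim=-1)
        # Attentively combine the transited distributions over predecessor edges.
        attn = F.softmax(self.attn_logits, dim=0)            # (num_predecessors,)
        return torch.einsum('p,pj->j', attn, transited)      # (num_ops,) inner-edge weights

# Example usage: an inner edge with 2 predecessor edges and 8 candidate operations.
layer = InterLayerTransition(num_ops=8, num_predecessors=2)
alphas = torch.randn(2, 8)
inner_weights = layer(alphas)  # (8,) distribution over candidate operations
```

In this sketch the outer edges would keep independently learned architecture parameters, while inner edges obtain theirs through the module above, so gradients flow back into both the transition matrices and the predecessor weights.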