Cyclic Differentiable Architecture Search

Paper Title

Cyclic Differentiable Architecture Search

Authors

Hongyuan Yu, Houwen Peng, Yan Huang, Jianlong Fu, Hao Du, Liang Wang, Haibin Ling

Abstract

Differentiable ARchiTecture Search, i.e., DARTS, has drawn great attention in neural architecture search. It tries to find the optimal architecture in a shallow search network and then measures its performance in a deep evaluation network. The independent optimization of the search and evaluation networks, however, leaves room for potential improvement by allowing interaction between the two networks. To address the problematic optimization issue, we propose new joint optimization objectives and a novel Cyclic Differentiable ARchiTecture Search framework, dubbed CDARTS. Considering the structure difference, CDARTS builds a cyclic feedback mechanism between the search and evaluation networks with introspective distillation. First, the search network generates an initial architecture for evaluation, and the weights of the evaluation network are optimized. Second, the architecture weights in the search network are further optimized by the label supervision in classification, as well as the regularization from the evaluation network through feature distillation. Repeating the above cycle results in joint optimization of the search and evaluation networks and thus enables the evolution of the architecture to fit the final evaluation network. The experiments and analysis on CIFAR, ImageNet and NAS-Bench-201 demonstrate the effectiveness of the proposed approach over the state-of-the-art ones. Specifically, in the DARTS search space, we achieve 97.52% top-1 accuracy on CIFAR10 and 76.3% top-1 accuracy on ImageNet. In the chain-structured search space, we achieve 78.2% top-1 accuracy on ImageNet, which is 1.1% higher than EfficientNet-B0. Our code and models are publicly available at https://github.com/microsoft/Cream.
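
The two-step cycle described in the abstract can be illustrated with a deliberately tiny toy. The sketch below is not the authors' implementation (that lives in the repository linked above). It assumes a single edge with three hypothetical candidate operations that only scale their input, a one-weight evaluation network, squared error standing in for the classification loss, and a finite-difference gradient in place of backpropagation. All function names, the `lam` distillation weight, and the learning rates are made up for illustration.

```python
import math

# Hypothetical candidate operations on one edge: each just scales its input.
CANDIDATE_SCALES = [0.5, 1.0, 2.0]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def search_forward(alpha, x):
    """Search network: softmax(alpha)-weighted mixture of all candidate ops."""
    probs = softmax(alpha)
    return sum(p * s * x for p, s in zip(probs, CANDIDATE_SCALES))

def derive_architecture(alpha):
    """Discretize: keep the candidate with the largest architecture weight."""
    return max(range(len(alpha)), key=lambda k: alpha[k])

def eval_forward(w, op_index, x):
    """Evaluation network: the derived op followed by one trainable weight."""
    return w * CANDIDATE_SCALES[op_index] * x

def search_loss(alpha, w, op_index, data, lam):
    """Label loss plus lam times a distillation term toward the evaluation output."""
    total = 0.0
    for x, y in data:
        s = search_forward(alpha, x)
        e = eval_forward(w, op_index, x)
        total += (s - y) ** 2 + lam * (s - e) ** 2
    return total / len(data)

def numerical_grad(f, params, eps=1e-5):
    """Central finite differences; a stand-in for backpropagation."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def cyclic_search(data, cycles=50, lam=0.5, lr_w=0.05, lr_alpha=0.5):
    alpha = [0.0] * len(CANDIDATE_SCALES)  # architecture weights
    w = 1.0                                # evaluation-network weight
    for _ in range(cycles):
        # Step 1: derive an architecture and train the evaluation network on it.
        op = derive_architecture(alpha)
        for _ in range(10):
            grad_w = sum(
                2 * (eval_forward(w, op, x) - y) * CANDIDATE_SCALES[op] * x
                for x, y in data
            ) / len(data)
            w -= lr_w * grad_w
        # Step 2: update the architecture weights with label supervision
        # plus distillation from the (now fixed) evaluation network.
        grads = numerical_grad(
            lambda a: search_loss(a, w, op, data, lam), alpha
        )
        alpha = [a - lr_alpha * g for a, g in zip(alpha, grads)]
    return alpha, w

# Toy data whose target is exactly "scale the input by 2".
data = [(x, 2.0 * x) for x in (-1.0, 0.5, 1.0)]
alpha, w = cyclic_search(data)
print("architecture weights:", softmax(alpha))
print("derived op index:", derive_architecture(alpha))
```

In this toy, the architecture weights drift toward the candidate with scale 2.0, since both the label term and the distillation term pull the mixture output toward the target. The point is only to show how the two steps alternate; the depth mismatch between search and evaluation networks that motivates the paper is not modeled here.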
