Paper Title
Multi-objective Neural Architecture Search with Almost No Training
Paper Authors
Paper Abstract
In the recent past, neural architecture search (NAS) has attracted increasing attention from both academia and industries. Despite the steady stream of impressive empirical results, most existing NAS algorithms are computationally prohibitive to execute due to the costly iterations of stochastic gradient descent (SGD) training. In this work, we propose an effective alternative, dubbed Random-Weight Evaluation (RWE), to rapidly estimate the performance of network architectures. By just training the last linear classification layer, RWE reduces the computational cost of evaluating an architecture from hours to seconds. When integrated within an evolutionary multi-objective algorithm, RWE obtains a set of efficient architectures with state-of-the-art performance on CIFAR-10 with less than two hours' searching on a single GPU card. Ablation studies on rank-order correlations and transfer learning experiments to ImageNet have further validated the effectiveness of RWE.
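The abstract's core idea is to score a candidate architecture by freezing its randomly initialized backbone and training only the final linear classification layer. Below is a minimal sketch of that Random-Weight Evaluation (RWE) procedure, assuming PyTorch; the backbone module, data loaders, and hyperparameters are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of Random-Weight Evaluation (RWE), assuming PyTorch.
# All names and hyperparameters here are illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def random_weight_evaluation(backbone: nn.Module,
                             feature_dim: int,
                             train_loader,
                             val_loader,
                             num_classes: int = 10,
                             epochs: int = 5,
                             device: str = "cpu") -> float:
    """Estimate an architecture's quality without training the backbone.

    The backbone keeps its random initialization and is frozen; only a
    linear classifier on top of its features is trained, which takes
    seconds rather than hours of full SGD training.
    """
    backbone = backbone.to(device).eval()
    for p in backbone.parameters():          # freeze the random backbone
        p.requires_grad_(False)

    classifier = nn.Linear(feature_dim, num_classes).to(device)
    optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)

    for _ in range(epochs):                  # train only the last linear layer
        classifier.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            with torch.no_grad():
                feats = backbone(x)          # features from random weights
            loss = F.cross_entropy(classifier(feats), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Validation accuracy serves as the cheap performance proxy.
    classifier.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in val_loader:
            x, y = x.to(device), y.to(device)
            pred = classifier(backbone(x)).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total
```

In the search described by the abstract, a score like this would serve as one objective inside an evolutionary multi-objective algorithm, presumably paired with an efficiency measure of the architecture; the abstract does not specify the exact second objective.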