Paper Title
Accuracy Prediction with Non-neural Model for Neural Architecture Search
Paper Authors
Paper Abstract
Neural architecture search (NAS) with an accuracy predictor that predicts the accuracy of candidate architectures has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural-network-based predictors, which require more delicate design and are prone to overfitting. Considering that most architectures are represented as sequences of discrete symbols, which resemble tabular data and are better suited to non-neural predictors, in this paper we study an alternative approach that uses a non-neural model for accuracy prediction. Specifically, since decision-tree-based models can better handle tabular data, we leverage gradient boosting decision trees (GBDT) as the predictor for NAS. We demonstrate that the GBDT predictor can achieve prediction accuracy comparable to (if not better than) that of neural-network-based predictors. Moreover, considering that a compact search space can ease the search process, we propose to prune the search space gradually according to important features derived from GBDT. In this way, NAS can be performed by first pruning the search space and then searching for a neural architecture, which is more efficient and effective. Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as a predictor for NAS: (1) on NASBench-101, it is 22x, 8x, and 6x more sample efficient than random search, regularized evolution, and Monte Carlo Tree Search (MCTS), respectively, in finding the global optimum; (2) it achieves a 24.2% top-1 error rate on ImageNet, and further achieves a 23.4% top-1 error rate when enhanced with search space pruning. Code is provided at https://github.com/renqianluo/GBDT-NAS.
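To make the core idea concrete, the sketch below shows how a GBDT can act as an accuracy predictor over discrete architecture encodings. It is a minimal illustration under stated assumptions, not the paper's implementation (see the linked repo for that): it assumes the LightGBM library, uses random synthetic data in place of real architecture/accuracy pairs, and picks hypothetical dimensions (20 choice positions, 5 operators per position).

```python
import numpy as np
import lightgbm as lgb  # assumed GBDT library; any GBDT implementation would do

# Hypothetical encoding: each architecture is a sequence of 20 discrete
# choices (e.g., an operator ID per position), treated as one tabular row.
rng = np.random.default_rng(0)
num_archs, seq_len, num_ops = 1000, 20, 5
X_train = rng.integers(0, num_ops, size=(num_archs, seq_len))
y_train = rng.random(num_archs)  # stand-in for measured validation accuracies

# Fit the GBDT predictor: architecture encoding -> predicted accuracy.
# (In practice the discrete IDs might be one-hot encoded or marked
# categorical; plain integers keep the sketch short.)
predictor = lgb.LGBMRegressor(n_estimators=100, num_leaves=31)
predictor.fit(X_train, y_train)

# Score a large pool of unseen candidates and keep the top ones, so that
# only the most promising architectures need to be actually trained.
candidates = rng.integers(0, num_ops, size=(10000, seq_len))
scores = predictor.predict(candidates)
top_candidates = candidates[np.argsort(scores)[::-1][:10]]

# Per-position feature importances; signals of this kind are what make
# the paper's gradual search-space pruning possible.
print(predictor.feature_importances_)
```

The follow-up pruning step described in the abstract works on the same principle: feature-importance signals from the fitted GBDT indicate which architecture choices matter for accuracy, so unpromising choices can be dropped before searching.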