Paper Title

Practical and sample efficient zero-shot HPO

Paper Authors

Fela Winkelmolen, Nikita Ivkin, H. Furkan Bozkurt, Zohar Karnin

Paper Abstract

Zero-shot hyperparameter optimization (HPO) is a simple yet effective use of transfer learning for constructing a small list of hyperparameter (HP) configurations that complement each other. That is to say, for any given dataset, at least one of them is expected to perform well. Current techniques for obtaining this list are computationally expensive, as they rely on running training jobs on a diverse collection of datasets and a large collection of randomly drawn HPs. This cost is especially problematic in environments where the space of HPs is regularly changing due to new algorithm versions or changing architectures of deep networks. We provide an overview of available approaches and introduce two novel techniques to handle the problem. The first is based on a surrogate model and adaptively chooses (dataset, configuration) pairs to query. The second, for settings where finding, tuning and testing a surrogate model is problematic, is a multi-fidelity technique combining HyperBand with submodular optimization. We benchmark our methods experimentally on five tasks (XGBoost, LightGBM, CatBoost, MLP and AutoML) and show significant improvement in accuracy compared to standard zero-shot HPO with the same training budget. In addition to contributing new algorithms, we provide an extensive study of the zero-shot HPO technique, resulting in (1) default hyper-parameters for popular algorithms that will benefit the community using them, and (2) massive lookup tables to further research on hyper-parameter tuning.
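To make the list-construction objective concrete: given a precomputed lookup table of validation errors (datasets × configurations), a small complementary list can be built greedily so that every dataset is covered by at least one strong configuration. The sketch below is a minimal illustration, not the authors' implementation; the function name greedy_zero_shot_list, the synthetic table, and the budget k are illustrative assumptions.

```python
# Minimal sketch of greedy zero-shot HPO list construction from a lookup
# table of errors (datasets x configurations). Not the authors' code.
import numpy as np


def greedy_zero_shot_list(errors: np.ndarray, k: int) -> list[int]:
    """Greedily select k configuration indices so that the average
    per-dataset best (minimum) error of the selected list is low.

    errors[i, j] = validation error of configuration j on dataset i.
    """
    n_datasets = errors.shape[0]
    selected: list[int] = []
    # Best error achieved per dataset by the current list (starts at +inf).
    best_so_far = np.full(n_datasets, np.inf)
    for _ in range(k):
        # Average per-dataset error if each candidate were added to the list.
        candidate_scores = np.minimum(best_so_far[:, None], errors).mean(axis=0)
        candidate_scores[selected] = np.inf  # never pick the same config twice
        j = int(np.argmin(candidate_scores))
        selected.append(j)
        best_so_far = np.minimum(best_so_far, errors[:, j])
    return selected


if __name__ == "__main__":
    # Synthetic lookup table: 50 datasets, 200 randomly drawn HP configs.
    rng = np.random.default_rng(0)
    table = rng.random((50, 200))
    print(greedy_zero_shot_list(table, k=5))
```

Each greedy step picks the configuration that most reduces the average per-dataset best error, a coverage-style objective of the kind that makes submodular optimization a natural ingredient in the paper's second, multi-fidelity method.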
