Paper Title
Rethinking Default Values: a Low Cost and Efficient Strategy to Define Hyperparameters
Paper Authors
Paper Abstract
Machine Learning (ML) algorithms have been increasingly applied to problems from several different areas. Despite their growing popularity, their predictive performance is usually affected by the values assigned to their hyperparameters (HPs). As a consequence, researchers and practitioners face the challenge of how to set these values. Many users have limited knowledge about ML algorithms and the effect of their HP values and, therefore, do not take advantage of suitable settings. They usually define HP values by trial and error, which is highly subjective, offers no guarantee of finding good values, and depends on the user's experience. Tuning techniques search for HP values that maximize the predictive performance of induced models for a given dataset, but have the drawback of a high computational cost. Thus, practitioners often use the default values suggested by the algorithm's developers or by the tools implementing the algorithm. Although default values usually result in models with acceptable predictive performance, different implementations of the same algorithm can suggest distinct default values. To strike a balance between tuning and using default values, we propose a strategy to generate new optimized default values. Our approach is based on a small set of optimized values that achieve better predictive performance than the default settings provided by popular tools. After performing a large-scale experiment and a careful analysis of the results, we concluded that our approach delivers better default values. Moreover, it leads to solutions that are competitive with tuned values while being easier to use and having a lower cost. We also extracted simple rules to guide practitioners in deciding whether to use our new methodology or an HP tuning approach.
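The abstract describes the idea only at a high level. The sketch below is a minimal illustration, not the paper's actual protocol: it assumes an SVM classifier, a small hypothetical grid of C/gamma candidates, and three scikit-learn toy datasets, and picks as a "shared optimized default" the single HP setting with the best average cross-validated accuracy across the datasets, in contrast to tuning each dataset separately.

```python
# Minimal sketch of a "shared optimized default": choose one HP setting that
# maximizes average cross-validated performance over a small collection of
# datasets, then reuse it as a new default. The SVM, the candidate grid and
# the toy datasets below are illustrative assumptions, not the paper's setup.
from itertools import product

import numpy as np
from sklearn.datasets import load_breast_cancer, load_digits, load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

datasets = [load_wine(return_X_y=True),
            load_breast_cancer(return_X_y=True),
            load_digits(return_X_y=True)]

# Hypothetical candidate set of HP values (a grid, purely for illustration).
candidates = [{"C": c, "gamma": g}
              for c, g in product([0.1, 1, 10, 100], [1e-3, 1e-2, 1e-1, 1])]

def avg_score(hp):
    """Mean 5-fold cross-validated accuracy of one HP setting over all datasets."""
    scores = []
    for X, y in datasets:
        model = make_pipeline(StandardScaler(), SVC(**hp))
        scores.append(cross_val_score(model, X, y, cv=5).mean())
    return np.mean(scores)

# The candidate that performs best *on average* becomes the shared default.
new_default = max(candidates, key=avg_score)
print("Optimized shared default:", new_default)
```

Once selected, the shared setting is simply reused as-is on new datasets, which is what keeps its cost comparable to ordinary defaults while a per-dataset tuning procedure would repeat the whole search for every new problem.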