Paper Title


When Do Curricula Work?

Authors

Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

Abstract


Inspired by human learning, researchers have proposed ordering examples during training based on their difficulty. Both curriculum learning, which exposes a network to easier examples early in training, and anti-curriculum learning, which shows the most difficult examples first, have been suggested as improvements over standard i.i.d. training. In this work, we set out to investigate the relative benefits of ordered learning. We first investigate the implicit curricula resulting from architectural and optimization bias and find that samples are learned in a highly consistent order. Next, to quantify the benefit of explicit curricula, we conduct extensive experiments over thousands of orderings spanning three kinds of learning: curriculum, anti-curriculum, and random-curriculum, in which the size of the training dataset is dynamically increased over time but the examples are randomly ordered. We find that for standard benchmark datasets, curricula have only marginal benefits, and that randomly ordered samples perform as well as or better than curricula and anti-curricula, suggesting that any benefit is entirely due to the dynamic training set size. Inspired by common use cases of curriculum learning in practice, we investigate the role of a limited training time budget and of noisy data in the success of curriculum learning. Our experiments demonstrate that curriculum, but not anti-curriculum, learning can indeed improve performance either with a limited training time budget or in the presence of noisy data.
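The three orderings compared in the abstract can be sketched as a data-loading policy: a per-example difficulty score fixes an order (easy-first, hard-first, or shuffled), and a pacing function grows the pool of available examples over time. This is a minimal illustrative sketch, not the paper's implementation; the linear pacing function, the `scores` input, and all function names here are assumptions for illustration.

```python
import random

def pacing_fraction(step, total_steps, start_frac=0.1):
    """Fraction of the dataset available at a given step.
    A linear schedule is assumed here; the paper sweeps many pacing functions."""
    frac = start_frac + (1.0 - start_frac) * step / total_steps
    return min(1.0, frac)

def make_order(scores, kind):
    """Order example indices by an (assumed) per-example difficulty score.
    kind: 'curriculum' (easiest first), 'anti' (hardest first), 'random'."""
    idx = list(range(len(scores)))
    if kind == "curriculum":
        idx.sort(key=lambda i: scores[i])                 # easiest first
    elif kind == "anti":
        idx.sort(key=lambda i: scores[i], reverse=True)   # hardest first
    elif kind == "random":
        random.shuffle(idx)                               # random-curriculum
    return idx

def batches(scores, kind, total_steps, batch_size):
    """Yield one batch per step, sampled from the currently unlocked prefix.
    In all three variants the training set grows dynamically over time;
    only the order in which examples are unlocked differs."""
    order = make_order(scores, kind)
    for step in range(total_steps):
        n_avail = max(batch_size,
                      int(pacing_fraction(step, total_steps) * len(order)))
        pool = order[:n_avail]
        yield random.sample(pool, min(batch_size, len(pool)))
```

The key point the abstract makes is visible in this structure: `random-curriculum` keeps the dynamic pool growth of `pacing_fraction` while discarding difficulty ordering, which isolates the contribution of training-set size from the contribution of ordering.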
