Paper Title

Training a T5 Using Lab-sized Resources

Authors

Ciosici, Manuel R., Derczynski, Leon

Abstract

Training large neural language models on large datasets is resource- and time-intensive. These requirements create a barrier to entry, where those with fewer resources cannot build competitive models. This paper presents various techniques for making it possible to (a) train a large language model using resources that a modest research lab might have, and (b) train it in a reasonable amount of time. We provide concrete recommendations for practitioners, which we illustrate with a case study: a T5 model for Danish, the first for this language.
