Paper Title
Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models
Paper Authors
Paper Abstract
Massively Multilingual Transformer based Language Models have been observed to be surprisingly effective on zero-shot transfer across languages, though the performance varies from language to language depending on the pivot language(s) used for fine-tuning. In this work, we build upon some of the existing techniques for predicting the zero-shot performance on a task, by modeling it as a multi-task learning problem. We jointly train predictive models for different tasks which helps us build more accurate predictors for tasks where we have test data in very few languages to measure the actual performance of the model. Our approach also lends us the ability to perform a much more robust feature selection and identify a common set of features that influence zero-shot performance across a variety of tasks.
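The abstract describes jointly training performance predictors across tasks so that a shared, sparse set of features is selected for all of them. A minimal sketch of that idea, not the paper's actual method, is multi-task linear regression with an l2,1 (group-lasso) penalty, which zeroes out entire feature rows jointly across tasks. All names, dimensions, and data below are hypothetical placeholders: rows stand in for languages, columns for typological features, and targets for per-task zero-shot scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 40 "languages" (rows), 6 candidate features (columns),
# and 3 downstream "tasks" whose zero-shot scores we want to predict.
n_lang, n_feat, n_tasks = 40, 6, 3
X = rng.normal(size=(n_lang, n_feat))

# Synthetic ground truth: only features 0 and 2 matter, for every task
# (the "common set of features" the abstract refers to).
W_true = np.zeros((n_feat, n_tasks))
W_true[0] = [0.9, 0.7, 0.8]
W_true[2] = [-0.5, -0.6, -0.4]
Y = X @ W_true + 0.05 * rng.normal(size=(n_lang, n_tasks))

def multitask_group_lasso(X, Y, lam=0.1, lr=0.01, n_iter=2000):
    """Proximal gradient descent for
    min_W ||XW - Y||_F^2 / (2n) + lam * sum_j ||W[j, :]||_2,
    where the row-wise penalty couples all tasks' weights for a feature."""
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n
        W = W - lr * grad
        # Proximal step: soft-threshold whole feature rows, so a feature is
        # either kept or dropped jointly across all tasks.
        norms = np.linalg.norm(W, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
        W = shrink * W
    return W

W_hat = multitask_group_lasso(X, Y)
selected = np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 1e-3)
print("features selected across all tasks:", selected.tolist())
```

On this synthetic data the row-wise penalty recovers exactly the two planted features, illustrating why joint training can yield a more robust selection than fitting each task's predictor independently.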