Paper Title

PICT@DravidianLangTech-ACL2022: Neural Machine Translation On Dravidian Languages

Paper Authors

Aditya Vyawahare, Rahul Tangsali, Aditya Mandke, Onkar Litake, Dipali Kadam

Paper Abstract

This paper presents a summary of the findings we obtained in the shared task on machine translation of Dravidian languages. We ranked first in three of the five sub-tasks assigned to us for the main shared task. We carried out neural machine translation for the following five language pairs: Kannada to Tamil, Kannada to Telugu, Kannada to Malayalam, Kannada to Sanskrit, and Kannada to Tulu. The dataset for each of the five language pairs was used to train various translation models, including Seq2Seq models such as LSTM, bidirectional LSTM, and Conv2Seq, training state-of-the-art transformers from scratch, and fine-tuning already pre-trained models. For some models involving monolingual corpora, we also implemented back-translation. The accuracy of these models was later tested on a part of the same dataset, using BLEU score as the evaluation metric.
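The evaluation step described above can be illustrated with a minimal sentence-level BLEU implementation. This is only a sketch for intuition (uniform n-gram weights, no smoothing); the paper does not state which scoring toolkit was used, and production evaluations typically rely on a standard implementation such as sacreBLEU.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, hypothesis, max_n=4):
    """Minimal sentence-level BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty. No smoothing."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped overlap: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(1, len(hypothesis) - n + 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is zero if any precision is zero
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages overly short hypotheses.
    if len(hypothesis) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(1, len(hypothesis)))
    return bp * geo_mean

ref = "the cat sat on the mat".split()
hyp = "the cat sat on the mat".split()
print(round(bleu(ref, hyp), 2))  # perfect match scores 1.0
```

In practice, the test split's reference translations play the role of `ref` and the model's decoded outputs play the role of `hyp`, with scores averaged (or computed corpus-level) over the whole test set.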
