Paper Title
Domain Adaptation of NMT models for English-Hindi Machine Translation Task at AdapMT ICON 2020
Paper Authors
Paper Abstract
Recent advancements in Neural Machine Translation (NMT) models have been shown to produce state-of-the-art results on machine translation for low-resource Indian languages. This paper describes the neural machine translation systems for the English-Hindi language pair presented at the AdapMT Shared Task, ICON 2020. The shared task aims to build translation systems for Indian languages in specific domains like Artificial Intelligence (AI) and Chemistry using a small in-domain parallel corpus. We evaluated the effectiveness of two popular NMT architectures, i.e., LSTM and Transformer, for the English-Hindi machine translation task based on BLEU scores. We trained these models primarily on out-of-domain data and employed simple domain adaptation techniques chosen according to the characteristics of the in-domain dataset: fine-tuning and mixed-domain data approaches were used for domain adaptation. Our team ranked first in the chemistry and general domain En-Hi translation tasks and second in the AI domain En-Hi translation task.
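The mixed-domain data approach mentioned in the abstract can be sketched as follows: the large out-of-domain corpus is concatenated with the small in-domain corpus, typically oversampling the in-domain sentence pairs so they are not drowned out during training. This is a minimal illustrative sketch; the oversampling factor and the toy corpora are assumptions for demonstration, not values or data from the paper.

```python
import random

def mix_domain_data(out_domain, in_domain, oversample=5, seed=0):
    """Build a mixed-domain training set of (source, target) pairs.

    The small in-domain corpus is repeated `oversample` times before
    shuffling, so in-domain examples appear more often per epoch.
    The factor 5 is an illustrative assumption, not from the paper.
    """
    mixed = list(out_domain) + list(in_domain) * oversample
    random.Random(seed).shuffle(mixed)
    return mixed

# Hypothetical toy corpora of (English, Hindi) sentence pairs.
general = [("hello", "नमस्ते")] * 100        # large out-of-domain corpus
chemistry = [("acid", "अम्ल")] * 4           # small in-domain corpus

mixed = mix_domain_data(general, chemistry, oversample=5)
print(len(mixed))                            # 100 + 4*5 = 120 pairs
```

The alternative mentioned in the abstract, fine-tuning, would instead first train the model on the out-of-domain corpus alone and then continue training on the in-domain corpus only.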