Paper Title

The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task

Authors

Ziqiang Zhang, Junyi Ao, Long Zhou, Shujie Liu, Furu Wei, Jinyu Li

Abstract

This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task, which translates from English audio to German, Chinese, and Japanese. The YiTrans system is built on large-scale pre-trained encoder-decoder models. More specifically, we first design a multi-stage pre-training strategy to build a multi-modality model with a large amount of labeled and unlabeled data. We then fine-tune the corresponding components of the model for the downstream speech translation tasks. Moreover, we make various efforts to improve performance, such as data filtering, data augmentation, speech segmentation, and model ensembling, among other techniques. Experimental results show that our YiTrans system obtains a significant improvement over the strong baseline on all three translation directions, and achieves a +5.2 BLEU improvement over last year's best end-to-end system on tst2021 English-German. Our final submissions rank first among end-to-end systems on English-German and English-Chinese in terms of the automatic evaluation metrics. We make our code and models publicly available.
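The paper's own code is not reproduced on this page. As one concrete illustration of the "model ensembling" step named in the abstract, the sketch below averages the weights of several fine-tuned checkpoints of the same model, a common lightweight form of ensembling. The file names and the assumption that each file stores a plain PyTorch state_dict are for illustration only and do not reflect the authors' actual implementation.

```python
# Minimal checkpoint-averaging sketch (illustrative; not the authors' code).
# Assumes every file in `paths` holds a plain PyTorch state_dict of the same model.
from collections import OrderedDict

import torch


def average_checkpoints(paths):
    """Return a state_dict whose floating-point tensors are the mean over `paths`."""
    avg_state = OrderedDict()
    for path in paths:
        state = torch.load(path, map_location="cpu")
        for name, param in state.items():
            if not torch.is_floating_point(param):
                # Integer buffers (e.g. step counters) are copied from the first checkpoint.
                avg_state.setdefault(name, param.clone())
                continue
            contrib = param.to(torch.float64) / len(paths)  # accumulate in float64 for stability
            if name in avg_state:
                avg_state[name] += contrib
            else:
                avg_state[name] = contrib
    # Cast the averaged weights back to float32 (assumed original precision).
    for name, param in avg_state.items():
        if torch.is_floating_point(param):
            avg_state[name] = param.to(torch.float32)
    return avg_state


# Hypothetical usage: merge the last few fine-tuning checkpoints and save the result.
# merged = average_checkpoints(["checkpoint8.pt", "checkpoint9.pt", "checkpoint10.pt"])
# torch.save(merged, "checkpoint_avg.pt")
```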
