SimulEval: An Evaluation Toolkit for Simultaneous Translation

Paper Title

SimulEval: An Evaluation Toolkit for Simultaneous Translation

Authors

Xutai Ma, Mohammad Javad Dousti, Changhan Wang, Jiatao Gu, Juan Pino

Abstract

Simultaneous translation on both text and speech focuses on a real-time and low-latency scenario where the model starts translating before reading the complete source input. Evaluating simultaneous translation models is more complex than offline models because the latency is another factor to consider in addition to translation quality. The research community, despite its growing focus on novel modeling approaches to simultaneous translation, currently lacks a universal evaluation procedure. Therefore, we present SimulEval, an easy-to-use and general evaluation toolkit for both simultaneous text and speech translation. A server-client scheme is introduced to create a simultaneous translation scenario, where the server sends source input and receives predictions for evaluation and the client executes customized policies. Given a policy, it automatically performs simultaneous decoding and collectively reports several popular latency metrics. We also adapt latency metrics from text simultaneous translation to the speech task. Additionally, SimulEval is equipped with a visualization interface to provide better understanding of the simultaneous decoding process of a system. SimulEval has already been extensively used for the IWSLT 2020 shared task on simultaneous speech translation. Code will be released upon publication.
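
To make the evaluation loop in the abstract concrete, here is a minimal sketch in Python. It does not use the SimulEval API: every name in it (`wait_k_policy`, `run_session`, `average_lagging`) is hypothetical, the "translation" is a toy that upper-cases source tokens, and the server and client roles are collapsed into one loop. The abstract does not name the latency metrics it reports; Average Lagging is used here only as one commonly cited text-level metric, under the assumption that it is representative.

```python
from typing import Callable, List, Tuple

READ, WRITE = "read", "write"
Policy = Callable[[int, int, bool], str]


def wait_k_policy(k: int) -> Policy:
    """Hypothetical client policy: stay k source tokens ahead of the output."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def policy(num_read: int, num_written: int, source_finished: bool) -> str:
        # Write once the source is exhausted or the lead of k tokens is reached.
        if source_finished or num_read - num_written >= k:
            return WRITE
        return READ

    return policy


def run_session(source: List[str], policy: Policy) -> Tuple[List[str], List[int]]:
    """Toy stand-in for the server side: feed source tokens on READ, collect
    predictions on WRITE, and record how many tokens had been read at each write."""
    num_read = 0
    outputs: List[str] = []
    delays: List[int] = []
    while len(outputs) < len(source):  # toy assumption: target length == source length
        finished = num_read == len(source)
        if policy(num_read, len(outputs), finished) == READ:
            num_read += 1
        else:
            outputs.append(source[len(outputs)].upper())  # toy "translation"
            delays.append(num_read)
    return outputs, delays


def average_lagging(delays: List[int], source_len: int) -> float:
    """Average Lagging: mean of d_i - (i - 1) / gamma over target positions up to
    the first one written after the full source was read, with gamma = |Y| / |X|."""
    gamma = len(delays) / source_len
    tau = next(
        (i for i, d in enumerate(delays, start=1) if d >= source_len),
        len(delays),
    )
    return sum(delays[i] - i / gamma for i in range(tau)) / tau


if __name__ == "__main__":
    source = ["a", "b", "c", "d", "e"]
    outputs, delays = run_session(source, wait_k_policy(2))
    print(outputs, delays, average_lagging(delays, len(source)))
```

In this toy setting a wait-2 policy on a five-token source writes after reading 2, 3, 4, 5, and 5 tokens, so the Average Lagging comes out to 2 tokens, matching the policy's lead. A real evaluation would replace the toy translation with a model and, for speech input, measure delays in time rather than in tokens.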

Paper Title