ESPnet-ST: All-in-One Speech Translation Toolkit

Paper Title

ESPnet-ST: All-in-One Speech Translation Toolkit

Authors

Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe

Abstract

We present ESPnet-ST, which is designed for the quick development of speech-to-speech translation systems in a single framework. ESPnet-ST is a new project within the end-to-end speech processing toolkit ESPnet; it integrates or newly implements automatic speech recognition, machine translation, and text-to-speech functions for speech translation. We provide all-in-one recipes, including data pre-processing, feature extraction, training, and decoding pipelines, for a wide range of benchmark datasets. Our reproducible results match or even outperform the current state-of-the-art performance, and the pre-trained models are available for download. The toolkit is publicly available at https://github.com/espnet/espnet.
