Paper Title

ESPnet-ST: All-in-One Speech Translation Toolkit

Authors

Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe

Abstract

We present ESPnet-ST, which is designed for the quick development of speech-to-speech translation systems in a single framework. ESPnet-ST is a new project inside the end-to-end speech processing toolkit ESPnet, which integrates or newly implements automatic speech recognition, machine translation, and text-to-speech functions for speech translation. We provide all-in-one recipes, including data pre-processing, feature extraction, training, and decoding pipelines, for a wide range of benchmark datasets. Our reproducible results match or even outperform the current state-of-the-art performance, and the pre-trained models are downloadable. The toolkit is publicly available at https://github.com/espnet/espnet.
