Paper Title

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

Authors

Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Abstract

Although more than seven hundred ethnic languages are spoken in Indonesia, the technology available to support communication within indigenous communities, as well as with people outside the villages, remains limited. As a result, indigenous communities still face isolation due to cultural barriers, and languages continue to disappear. To accelerate communication, speech-to-speech translation (S2ST) technology is one approach that can overcome language barriers. However, S2ST systems require machine translation (MT), automatic speech recognition (ASR), and text-to-speech synthesis (TTS) components, which rely heavily on supervised training and on a broad set of language resources that can be difficult to collect from ethnic communities. Recently, a machine speech chain mechanism was proposed to enable ASR and TTS to assist each other in semi-supervised learning. The framework was initially implemented only in a monolingual setting. In this study, we focus on developing speech recognition and synthesis for four Indonesian ethnic languages: Javanese, Sundanese, Balinese, and Bataks. We first train ASR and TTS for standard Indonesian separately with supervised training. We then develop ASR and TTS for the ethnic languages by using the Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data, which removes the need for paired speech-text data in those ethnic languages.
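The closed loop in which ASR and TTS assist each other with unpaired data can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy character-level "models", the loss functions, and every name below (`text_only_step`, `speech_only_step`, `toy_tts`, `toy_asr`) are hypothetical stand-ins chosen only to show the direction of data flow in each half of the chain.

```python
from typing import Callable, List

Speech = List[float]  # toy stand-in for acoustic features (assumption)
Text = str


def text_only_step(text: Text,
                   tts: Callable[[Text], Speech],
                   asr: Callable[[Speech], Text],
                   text_loss: Callable[[Text, Text], float]) -> float:
    """Unpaired text: TTS synthesizes speech, ASR transcribes it back,
    and the loss compares the transcript with the original text.
    In a speech chain this loss would typically drive the ASR update."""
    synthetic_speech = tts(text)
    hypothesis = asr(synthetic_speech)
    return text_loss(text, hypothesis)


def speech_only_step(speech: Speech,
                     asr: Callable[[Speech], Text],
                     tts: Callable[[Text], Speech],
                     speech_loss: Callable[[Speech, Speech], float]) -> float:
    """Unpaired speech: ASR transcribes, TTS reconstructs the speech,
    and the loss compares the reconstruction with the original speech.
    In a speech chain this loss would typically drive the TTS update."""
    transcript = asr(speech)
    reconstruction = tts(transcript)
    return speech_loss(speech, reconstruction)


# Toy "models" (purely illustrative): one feature value per character.
def toy_tts(text: Text) -> Speech:
    return [float(ord(c)) for c in text]


def toy_asr(speech: Speech) -> Text:
    return "".join(chr(round(x)) for x in speech)


def char_mismatch(ref: Text, hyp: Text) -> float:
    """Count differing positions plus any length difference."""
    diff = sum(a != b for a, b in zip(ref, hyp))
    return float(diff + abs(len(ref) - len(hyp)))


def mse(ref: Speech, rec: Speech) -> float:
    """Mean squared error over the overlapping frames."""
    n = min(len(ref), len(rec))
    if n == 0:
        return 0.0
    return sum((a - b) ** 2 for a, b in zip(ref, rec)) / n


if __name__ == "__main__":
    print(text_only_step("halo", toy_tts, toy_asr, char_mismatch))
    print(speech_only_step([104.4, 97.0], toy_asr, toy_tts, mse))
```

In a real system the two callables would be neural ASR and TTS models and the returned losses would be backpropagated; the sketch only shows why text-only data can supervise ASR and speech-only data can supervise TTS.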
