Paper Title
DiDiSpeech: A Large Scale Mandarin Speech Corpus
Authors
Abstract
This paper introduces DiDiSpeech, a new open-source Mandarin speech corpus. It consists of about 800 hours of speech data sampled at 48 kHz from 6000 speakers, together with the corresponding texts. All speech data in the corpus was recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech, and automatic speech recognition. We conduct experiments on multiple speech tasks and evaluate the performance, showing that the corpus is promising for both academic research and practical applications. The corpus is available at https://outreach.didichuxing.com/research/opendata/.