Paper Title

The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet

Authors

Zhang, Haitong

Abstract

This paper describes our submission to the Voice Conversion Challenge (VCC) 2020: a vector-quantization variational autoencoder (VQ-VAE) with WaveNet as the decoder, i.e., VQ-VAE-WaveNet. VQ-VAE-WaveNet is a nonparallel VAE-based voice conversion model that reconstructs the acoustic features while separating the linguistic information from speaker identity. The model is further improved with the WaveNet cycle as the decoder to generate high-quality speech waveforms, since WaveNet, as an autoregressive neural vocoder, has achieved state-of-the-art (SoTA) results in waveform generation. In practice, our system can be developed with the VCC 2020 dataset for both Task 1 (intra-lingual) and Task 2 (cross-lingual). However, we submitted our system only for the intra-lingual voice conversion task. The results of VCC 2020 show that VQ-VAE-WaveNet achieves a mean opinion score (MOS) of 3.04 in naturalness and an average score of 3.28 in similarity (a speaker similarity percentage (Sim) of 75.99%) for Task 1. The subjective evaluations also reveal that our system gives top performance when no supervised learning is involved. Moreover, our system performs well in some objective evaluations. Specifically, it achieves an average score of 3.95 in automatic naturalness prediction and ranks 6th and 8th in ASV-based speaker similarity and spoofing countermeasures, respectively.
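As a rough illustration of the vector-quantization bottleneck named in the abstract, the sketch below maps each encoder output frame to its nearest codebook vector and then attaches a target-speaker embedding before decoding. It is not the authors' implementation: the function names, toy codebook, frame values, and speaker embedding are all hypothetical, and it assumes (as is common in VQ-VAE voice conversion setups) that the decoder is conditioned on a speaker embedding. The WaveNet decoder itself is omitted.

```python
from typing import List, Sequence, Tuple


def quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each encoder output frame to its nearest codebook vector.

    Returns the chosen code indices and the quantized frames.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every code vector.
        distances = [
            sum((f - c) ** 2 for f, c in zip(frame, code)) for code in codebook
        ]
        best = min(range(len(codebook)), key=distances.__getitem__)
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


def decoder_conditioning(
    quantized: Sequence[Sequence[float]],
    speaker_embedding: Sequence[float],
) -> List[List[float]]:
    """Concatenate a speaker embedding onto every quantized frame."""
    return [list(q) + list(speaker_embedding) for q in quantized]


# Toy values, purely illustrative (not taken from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.9, 1.2], [0.1, -0.2], [-0.8, 0.7]]
target_speaker = [0.5, -0.5]  # hypothetical target-speaker embedding

indices, quantized = quantize(frames, codebook)
conditioning = decoder_conditioning(quantized, target_speaker)
```

In this sketch, conversion amounts to keeping the discrete codes, which are intended to carry the linguistic content, and swapping in the embedding of the target speaker before the decoder generates the waveform.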
