Paper title
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning
Authors
Abstract
Voice conversion (VC) refers to changing the timbre of speech while retaining its linguistic content. Recently, many works have focused on disentanglement-based learning techniques that separate timbre and linguistic content information in a speech signal. Once this separation succeeds, voice conversion becomes feasible and straightforward. This paper proposes a novel one-shot voice conversion framework, called AVQVC, based on vector quantization voice conversion (VQVC) and AutoVC. A new training method is applied to VQVC to separate content and timbre information from speech more effectively. The results show that this approach outperforms VQVC in separating content and timbre, which improves the sound quality of the generated speech.