Visual-speech Synthesis of Exaggerated Corrective Feedback

Paper Title

Visual-speech Synthesis of Exaggerated Corrective Feedback

Authors

Bu, Yaohua; Li, Weijun; Ma, Tianyi; Chen, Shengqi; Jia, Jia; Li, Kun; Lu, Xiaobo

Abstract

To provide more discriminative feedback that helps second language (L2) learners better identify their mispronunciations, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT). The speech exaggeration is realized by an emphatic speech generation neural network based on Tacotron, while the visual exaggeration is accomplished by ADC Viseme Blending, namely increasing the Amplitude of movement, extending the phone's Duration, and enhancing the color Contrast. User studies show that the exaggerated feedback outperforms the non-exaggerated version in helping learners with pronunciation identification and pronunciation improvement.
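
The three ADC adjustments named in the abstract (amplitude, duration, contrast) can be illustrated with a minimal sketch. This is not the authors' implementation: the `VisemeKey` structure, the `exaggerate` function, and the default gain values are all hypothetical, chosen only to show what scaling each of the three quantities might look like on a single viseme keyframe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VisemeKey:
    """Hypothetical keyframe for one phone (not from the paper)."""
    phone: str
    duration: float        # seconds the viseme is held
    weights: List[float]   # blend weights: 0.0 = neutral pose, 1.0 = full pose
    color: List[float]     # RGB of the highlighted articulator, each in [0, 1]


def _clamp(x: float) -> float:
    """Keep a value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, x))


def exaggerate(key: VisemeKey, amp: float = 1.5, dur: float = 1.3,
               contrast: float = 1.4) -> VisemeKey:
    """Return an exaggerated copy of `key`.

    The gains are illustrative assumptions:
    - amp scales the blend weights away from the neutral pose (Amplitude),
    - dur stretches the time the viseme is held (Duration),
    - contrast pushes each color channel away from mid-gray 0.5 (Contrast).
    """
    return VisemeKey(
        phone=key.phone,
        duration=key.duration * dur,
        weights=[_clamp(w * amp) for w in key.weights],
        color=[_clamp(0.5 + (c - 0.5) * contrast) for c in key.color],
    )
```

Calling `exaggerate` only on the phones flagged as mispronounced, and leaving the remaining keyframes untouched, would be one way to make the corrected segment stand out from the rest of the utterance.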
