A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

Paper Title

A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

Authors

Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Abstract

Speaker recognition is a well-known and well-studied task in the speech processing domain. It has many applications, in both security and the speaker adaptation of personal devices. In this paper, we present a new paradigm for automatic speaker recognition that we call Interactive Speaker Recognition (ISR). In this paradigm, the recognition system aims to incrementally build a representation of the speakers by requesting personalized utterances, in contrast to the standard text-dependent or text-independent schemes. To do so, we cast the speaker recognition task as a sequential decision-making problem that we solve with Reinforcement Learning. Using a standard dataset, we show that our method achieves excellent performance while using only a small amount of speech signal. This method could also be applied as an utterance selection mechanism for building speech synthesis systems.
