Multitask Learning with Capsule Networks for Speech-to-Intent Applications

Paper title

Multitask Learning with Capsule Networks for Speech-to-Intent Applications

Authors

Jakob Poncelet, Hugo Van hamme

Abstract

Voice-controlled applications can be a great aid to society, especially for physically challenged people. However, this requires robustness to all kinds of variation in speech. A spoken language understanding system that learns from interaction with, and demonstrations from, the user can be used in different settings and for different types of speech, even deviant or impaired speech, while also letting the user choose the phrasing. The user gives a command and enters the intent through an interface, after which the model learns to map the speech directly to the right action. Since the effort required of the user should be as low as possible, capsule networks have drawn interest because they potentially need little training data compared to deeper neural networks. In this paper, we show how capsules can incorporate multitask learning, which can often improve the performance of a model when the task is difficult. The basic capsule network is expanded with a regularisation that creates more structure in its output: it learns to identify the speaker of the utterance by forcing the required information into the capsule vectors. To this end, we move from a speaker-dependent to a speaker-independent setting.
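The multitask idea in the abstract, a main intent task regularised by an auxiliary speaker-identification task, can be illustrated with a minimal sketch. This is not the authors' loss function: the abstract does not specify one. The sketch assumes single-label softmax classification for both tasks, and the names `softmax_cross_entropy`, `multitask_loss`, and `speaker_weight` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one correct class."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_index]


def multitask_loss(intent_logits, intent_target,
                   speaker_logits, speaker_target,
                   speaker_weight=0.1):
    """Main intent loss plus a weighted auxiliary speaker-identification loss.

    Assumption: both tasks are treated as single-label classification, and
    `speaker_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    intent_loss = softmax_cross_entropy(intent_logits, intent_target)
    speaker_loss = softmax_cross_entropy(speaker_logits, speaker_target)
    return intent_loss + speaker_weight * speaker_loss


if __name__ == "__main__":
    # Two intents, four speakers, all logits equal (an uninformed model).
    print(multitask_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 1, speaker_weight=0.5))
```

Setting `speaker_weight` to zero recovers plain single-task training on intents, so the weight controls how strongly the auxiliary task shapes the shared representation.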
