Paper Title

Anti-Transfer Learning for Task Invariance in Convolutional Neural Networks for Speech Processing

Paper Authors

Eric Guizzo, Tillman Weyde, Giacomo Tarroni

Paper Abstract

We introduce the novel concept of anti-transfer learning for speech processing with convolutional neural networks. While transfer learning assumes that the learning process for a target task will benefit from re-using representations learned for another task, anti-transfer avoids the learning of representations that have been learned for an orthogonal task, i.e., one that is not relevant and potentially misleading for the target task, such as speaker identity for speech recognition or speech content for emotion recognition. In anti-transfer learning, we penalize similarity between activations of a network being trained and another one previously trained on an orthogonal task, which yields more suitable representations. This leads to better generalization and provides a degree of control over correlations that are spurious or undesirable, e.g. to avoid social bias. We have implemented anti-transfer for convolutional neural networks in different configurations with several similarity metrics and aggregation functions, which we evaluate and analyze with several speech and audio tasks and settings, using six datasets. We show that anti-transfer actually leads to the intended invariance to the orthogonal task and to more appropriate features for the target task at hand. Anti-transfer learning consistently improves classification accuracy in all test cases. While anti-transfer creates computation and memory cost at training time, there is relatively little computation cost when using pre-trained models for orthogonal tasks. Anti-transfer is widely applicable and particularly useful where a specific invariance is desirable or where trained models are available and labeled data for orthogonal tasks are difficult to obtain.
