Adversarial representation learning for private speech generation

Paper title

Adversarial representation learning for private speech generation

Paper authors

David Ericsson, Adam Östberg, Edvin Listo Zec, John Martinsson, Olof Mogren

Abstract

As more and more data is collected in various settings across organizations, companies, and countries, the demand for user privacy has increased. Developing privacy-preserving methods for data analytics is thus an important area of research. In this work we present a model based on generative adversarial networks (GANs) that learns to obfuscate specific sensitive attributes in speech data. We train a model that learns to hide sensitive information in the data while preserving the meaning of the utterance. The model is trained in two steps: first to filter sensitive information in the spectrogram domain, and then to generate new, private information independent of the filtered one. The model is based on a U-Net CNN that takes mel-spectrograms as input. A MelGAN is used to invert the spectrograms back to raw audio waveforms. We show that it is possible to hide sensitive information such as gender by generating new data, trained adversarially to maintain utility and realism.


Paper title