Paper Title

Privacy Adversarial Network: Representation Learning for Mobile Data Privacy

Authors

Sicong Liu, Junzhao Du, Anshumali Shrivastava, Lin Zhong

Abstract

The remarkable success of machine learning has fostered a growing number of cloud-based intelligent services for mobile users. Such services require a user to send data, e.g., images, voice, and video, to the provider, which presents a serious challenge to user privacy. To address this, prior works either obfuscate the data, e.g., by adding noise and removing identity information, or send representations extracted from the data, e.g., anonymized features. They struggle to balance service utility and data privacy because obfuscated data reduces utility and extracted representations may still reveal sensitive information. This work departs from prior works in methodology: we leverage adversarial learning to strike a better balance between privacy and utility. We design a \textit{representation encoder} that generates feature representations optimized against the privacy disclosure risk of sensitive information (a measure of privacy) posed by the \textit{privacy adversaries}, and concurrently optimized for task inference accuracy (a measure of utility) judged by the \textit{utility discriminator}. The result is the privacy adversarial network (\systemname), a novel deep model with a new training algorithm that can automatically learn representations from the raw data. Intuitively, the PAN adversary forces the extracted representations to convey only the information required by the target task. Surprisingly, this constitutes an implicit regularization that actually improves task accuracy. As a result, PAN achieves better utility and better privacy at the same time! We report extensive experiments on six popular datasets and demonstrate the superiority of \systemname over alternative methods reported in prior work.
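The adversarial setup described in the abstract (an encoder trained with a utility discriminator and against a privacy adversary) can be sketched as an alternating optimization. The following is a minimal illustrative sketch, not the authors' implementation: it assumes PyTorch, and all names (`Encoder`, `utility`, `adversary`, the trade-off weight `lam`, and the toy random data) are hypothetical stand-ins for the components named in the abstract.

```python
# Minimal PAN-style adversarial training sketch (illustrative, assuming PyTorch).
# The encoder is trained for task accuracy while *maximizing* the privacy
# adversary's loss; the adversary is trained to recover the sensitive attribute.
import torch
import torch.nn as nn

torch.manual_seed(0)

class Encoder(nn.Module):
    def __init__(self, d_in=16, d_z=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, d_z), nn.ReLU())

    def forward(self, x):
        return self.net(x)

enc = Encoder()
utility = nn.Linear(8, 2)    # utility discriminator: predicts the task label
adversary = nn.Linear(8, 2)  # privacy adversary: tries to infer the sensitive attribute
ce = nn.CrossEntropyLoss()
opt_main = torch.optim.Adam(list(enc.parameters()) + list(utility.parameters()), lr=1e-2)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-2)
lam = 0.5  # privacy/utility trade-off weight (illustrative value)

# Toy batch: inputs, task labels, and sensitive-attribute labels.
x = torch.randn(32, 16)
y_task = torch.randint(0, 2, (32,))
y_priv = torch.randint(0, 2, (32,))

for step in range(100):
    # 1) Adversary step: learn to infer the sensitive attribute from z
    #    (encoder is frozen via detach for this step).
    z = enc(x).detach()
    loss_adv = ce(adversary(z), y_priv)
    opt_adv.zero_grad()
    loss_adv.backward()
    opt_adv.step()

    # 2) Encoder + utility step: minimize task loss while pushing the
    #    adversary's loss up, so z conveys only task-relevant information.
    z = enc(x)
    loss = ce(utility(z), y_task) - lam * ce(adversary(z), y_priv)
    opt_main.zero_grad()
    loss.backward()
    opt_main.step()
```

The sign on the adversary term is the key design choice: the encoder descends the task loss but ascends the privacy loss, which is the "better privacy and better utility" tension the abstract describes.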
