Private, Efficient, and Accurate: Protecting Models Trained by Multi-party Learning with Differential Privacy

Paper Title

Private, Efficient, and Accurate: Protecting Models Trained by Multi-party Learning with Differential Privacy

Authors

Wenqiang Ruan, Mingxin Xu, Wenjing Fang, Li Wang, Lei Wang, Weili Han

Abstract

Secure multi-party computation-based machine learning, referred to as MPL, has become an important technology for utilizing data from multiple parties while preserving privacy. Although MPL provides rigorous security guarantees for the computation process, the models trained by MPL remain vulnerable to attacks that depend solely on access to the models. Differential privacy can help defend against such attacks. However, the accuracy loss caused by differential privacy and the large communication overhead of secure multi-party computation protocols make it highly challenging to balance the three-way trade-off between privacy, efficiency, and accuracy. In this paper, we address this issue by proposing a solution, referred to as PEA (Private, Efficient, Accurate), which consists of a secure DPSGD protocol and two optimization methods. First, we propose a secure DPSGD protocol to enforce DPSGD in secret sharing-based MPL frameworks. Second, to reduce the accuracy loss caused by differential privacy noise and the large communication overhead of MPL, we propose two optimization methods for the MPL training process: (1) a data-independent feature extraction method, which aims to simplify the structure of the trained model; (2) a local data-based global model initialization method, which aims to speed up the convergence of model training. We implement PEA in two open-source MPL frameworks: TF-Encrypted and Queqiao. Experimental results on various datasets demonstrate the efficiency and effectiveness of PEA. For example, when $\varepsilon = 2$, we can train a differentially private classification model with an accuracy of 88% for CIFAR-10 within 7 minutes under the LAN setting. This result significantly outperforms that of CryptGPU, a state-of-the-art MPL framework, which takes more than 16 hours to train a non-private deep neural network model on CIFAR-10 to the same accuracy.
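
For readers unfamiliar with DPSGD, the following is a minimal plaintext sketch of a single generic DPSGD update: clip each per-example gradient to a fixed L2 norm, sum the clipped gradients, add Gaussian noise, then average and take a gradient step. It is not the secure protocol proposed in the paper, which runs over secret-shared values; the function names and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions, not taken from the paper or from TF-Encrypted or Queqiao.

```python
import math
import random


def clip_by_l2_norm(grad, clip_norm):
    """Scale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dpsgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One plaintext DPSGD update (hypothetical helper, for illustration only).

    Steps: clip each per-example gradient, sum them, add Gaussian noise with
    standard deviation noise_multiplier * clip_norm, average, then descend.
    """
    dim = len(weights)
    clipped = [clip_by_l2_norm(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [w - lr * n / batch_size for w, n in zip(weights, noisy)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    grads = [[3.0, 4.0], [0.0, 0.5]]
    new_weights = dpsgd_step([0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                             noise_multiplier=1.1, rng=rng)
    print(new_weights)
```

In an MPL setting the gradients would be held as secret shares and the clipping and noise addition would be carried out by the secure protocol, which is where the communication overhead discussed in the abstract arises; this sketch only shows the arithmetic being protected.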
