Paper Title
SIMC 2.0: Improved Secure ML Inference Against Malicious Clients
Paper Authors
Paper Abstract
In this paper, we study the problem of secure ML inference against a malicious client and a semi-trusted server, such that the client learns only the inference output while the server learns nothing. This problem was first formulated by Lehmkuhl \textit{et al.} with a solution (MUSE, USENIX Security'21), whose performance was then substantially improved by Chandran et al.'s work (SIMC, USENIX Security'22). However, there still exists a nontrivial gap between these efforts and practicality, given the challenges of reducing overhead and accelerating secure inference in an all-round way. We propose SIMC 2.0, which retains the underlying structure of SIMC but significantly optimizes both the linear and non-linear layers of the model. Specifically, (1) we design a new encoding method for homomorphic parallel computation between matrices and vectors. It is custom-built from an insight into the complementarity between the cryptographic primitives in SIMC. As a result, it minimizes the number of rotation operations incurred in the computation, which are very expensive compared to other homomorphic operations (e.g., addition and multiplication). (2) We reduce the size of the garbled circuit (GC) used to compute non-linear activation functions (e.g., ReLU) in SIMC by about two-thirds. We then design an alternative lightweight protocol to perform the tasks originally allocated to the expensive GC. Compared with SIMC, our experiments show that SIMC 2.0 achieves a speedup of up to $17.4\times$ for linear layer computation, and at least a $1.3\times$ reduction in both the computation and communication overheads of the non-linear layers under different data dimensions. Meanwhile, SIMC 2.0 demonstrates an encouraging runtime boost of $2.3\sim 4.3\times$ over SIMC on different state-of-the-art ML models.
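The abstract's claim that rotations dominate the cost of homomorphic matrix-vector products can be illustrated with a plaintext simulation of the classic diagonal (Halevi-Shoup) encoding over SIMD slots. This is a minimal sketch of that standard baseline, not SIMC 2.0's actual encoding (which the paper describes, not this abstract); the function names and the rotation counter are illustrative only.

```python
# Plaintext simulation of the diagonal (Halevi-Shoup) encoding for
# matrix-vector products over SIMD slots. In a real HE library each
# rotate() below is a slot rotation, typically far more expensive than
# a slot-wise addition or multiplication, which is why encodings that
# minimize rotations (as SIMC 2.0 claims to) matter.

def rotate(vec, k):
    """Cyclic left rotation, mimicking a homomorphic slot rotation."""
    return vec[k:] + vec[:k]

def diag_matvec(M, v):
    """Compute M @ v via generalized diagonals, counting rotations."""
    n = len(v)
    acc = [0] * n
    rotations = 0
    for i in range(n):
        # i-th generalized diagonal: d[j] = M[j][(j + i) % n]
        d = [M[j][(j + i) % n] for j in range(n)]
        if i > 0:
            v_rot = rotate(v, i)
            rotations += 1  # one rotation per non-trivial diagonal
        else:
            v_rot = v
        # Slot-wise multiply-accumulate (cheap compared to rotations).
        acc = [a + di * vi for a, di, vi in zip(acc, d, v_rot)]
    return acc, rotations

M = [[1, 2], [3, 4]]
v = [5, 6]
out, rots = diag_matvec(M, v)
print(out, rots)  # [17, 39] with n - 1 = 1 rotation
```

For an n x n matrix this baseline already spends n - 1 rotations per product; encodings tailored to the surrounding protocol, like the one the abstract describes, aim to push that count lower.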