Paper Title
Can We Use Split Learning on 1D CNN Models for Privacy Preserving Training?
Paper Authors
Paper Abstract
A new collaborative learning technique, called split learning, was recently introduced with the aim of protecting user data privacy without revealing raw input data to a server. It collaboratively runs a deep neural network model that is split into two parts, one for the client and the other for the server, so the server has no direct access to the raw data processed at the client. To date, split learning has been considered a promising approach to protect the client's raw data; for example, the client's data was protected in healthcare image applications using 2D convolutional neural network (CNN) models. However, it is still unclear whether split learning can be applied to other deep learning models, in particular 1D CNNs. In this paper, we examine whether split learning can be used to perform privacy-preserving training for 1D CNN models. To answer this, we first design and implement a 1D CNN model under split learning and validate its efficacy in detecting heart abnormalities using medical ECG data. We observe that the 1D CNN model under split learning can achieve the same accuracy of 98.9% as the original (non-split) model. However, our evaluation demonstrates that split learning may fail to protect the raw data privacy of 1D CNN models. To address the observed privacy leakage in split learning, we adopt two mitigation techniques: 1) adding more hidden layers to the client side and 2) applying differential privacy. Although these techniques help reduce privacy leakage, they have a significant impact on model accuracy. Hence, based on these results, we conclude that split learning alone is not sufficient to maintain the confidentiality of raw sequential data in 1D CNN models.
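To make the split-learning setup concrete, below is a minimal sketch of a 1D CNN split between a client and a server, assuming PyTorch. The layer sizes, the cut point, and the ECG input shape are illustrative assumptions, not the architecture used in the paper.

```python
# A minimal sketch of split learning on a 1D CNN, assuming PyTorch.
# The layer sizes, split point, and ECG input shape are illustrative
# assumptions, not the exact architecture from the paper.
import torch
import torch.nn as nn

class ClientPart(nn.Module):
    """Client-side layers: raw ECG signals never leave this module."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3),  # single-channel ECG input
            nn.ReLU(),
            nn.MaxPool1d(2),
        )

    def forward(self, x):
        return self.layers(x)  # intermediate activations are sent to the server

class ServerPart(nn.Module):
    """Server-side layers: the server sees only the client's activations."""
    def __init__(self, num_classes=5):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv1d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.layers(x)

# One training step: the client runs the forward pass up to the cut layer,
# the server completes it, and gradients flow back across the split boundary.
client, server = ClientPart(), ServerPart()
optimizer = torch.optim.Adam(list(client.parameters()) + list(server.parameters()))
criterion = nn.CrossEntropyLoss()

ecg_batch = torch.randn(8, 1, 128)   # hypothetical batch of ECG segments
labels = torch.randint(0, 5, (8,))   # hypothetical heartbeat-class labels

smashed = client(ecg_batch)          # "smashed data" crossing to the server
logits = server(smashed)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()                      # gradients at the cut layer return to the client
optimizer.step()
```

Note that the only thing the server ever receives is `smashed`; the privacy question the paper raises is how much of the raw ECG signal can be reconstructed from these activations.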
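The differential-privacy mitigation can be sketched as noise added to the client's activations before transmission. The Laplace mechanism and the `epsilon`/`sensitivity` parameters below are assumptions for illustration; the paper's exact DP configuration may differ.

```python
# A minimal sketch of the differential-privacy mitigation, assuming noise is
# added to the client's activations before they are sent to the server.
# The Laplace mechanism and the epsilon/sensitivity values are illustrative.
import torch

def dp_perturb(activations: torch.Tensor, epsilon: float = 1.0,
               sensitivity: float = 1.0) -> torch.Tensor:
    """Add Laplace noise with scale sensitivity/epsilon to each activation.

    Smaller epsilon means stronger privacy but noisier activations, which is
    consistent with the accuracy loss the paper reports for this mitigation.
    """
    scale = sensitivity / epsilon
    noise = torch.distributions.Laplace(0.0, scale).sample(activations.shape)
    return activations + noise

# Usage: perturb the smashed data from the client part before transmission.
# smashed = client(ecg_batch)
# noisy_smashed = dp_perturb(smashed, epsilon=0.5)
# logits = server(noisy_smashed)
```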