Paper Title

Application of Data Encryption in Chinese Named Entity Recognition

Authors

Long, Kaifang; Dong, Jikun; Fan, Shengyu; Geng, Yanfang; Cao, Yang; Zhao, Han; Yu, Hui; Xu, Weizhi

Abstract

Deep learning has recently driven substantial improvements in named entity recognition (NER). However, in some fields, such as biomedicine and the military, data privacy and confidentiality requirements leave too little data available to train deep neural networks. In this paper, we propose an encryption learning framework that addresses data leakage and the difficulty of disclosing sensitive data in such domains. For the first time, we introduce multiple encryption algorithms to encrypt the training data of an NER task; in other words, we train the deep neural network on encrypted data. We conduct experiments on six Chinese datasets, three of which we constructed ourselves. The results show that models trained on encrypted data achieve satisfactory performance, and some even outperform their counterparts trained on unencrypted data. This supports the effectiveness of the introduced encryption methods and mitigates the data leakage problem to a certain extent.
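The abstract does not name the encryption algorithms used, so the following is only an illustrative sketch of the general idea of training on encrypted text. It assumes a keyed character-level substitution cipher, which is a stand-in chosen for simplicity and not necessarily one of the paper's methods. The function names, the key value, and the toy sentence are all hypothetical. The point it shows is that a deterministic one-to-one mapping hides the surface characters while keeping every token aligned with its BIO tag, so a sequence labeler can still be trained on the result.

```python
import random


def build_substitution_table(samples, key):
    """Build a keyed one-to-one character mapping over the corpus vocabulary.

    Assumption: a classical substitution cipher stands in for the
    (unspecified) encryption algorithms mentioned in the abstract.
    """
    vocab = sorted({ch for chars, _ in samples for ch in chars})
    shuffled = vocab[:]
    # A seeded generator makes the permutation reproducible for a given key.
    random.Random(key).shuffle(shuffled)
    return dict(zip(vocab, shuffled))


def encrypt_chars(chars, table):
    """Replace each character with its substitute; sequence length is unchanged."""
    return [table[ch] for ch in chars]


def invert_table(table):
    """Return the decryption mapping (valid because the table is a bijection)."""
    return {cipher: plain for plain, cipher in table.items()}


# Hypothetical toy sample: one character-level sentence with BIO tags.
train = [
    (list("张伟在北京工作"),
     ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]),
]

table = build_substitution_table(train, key=2024)  # key value is arbitrary

# Only the characters are transformed; the tag sequence is passed through as-is.
encrypted_train = [(encrypt_chars(chars, table), tags) for chars, tags in train]
```

The pairs in `encrypted_train` could then be fed to any character-level NER model in place of the plaintext pairs. A substitution cipher this simple is open to frequency analysis, so it illustrates the data flow only and should not be read as a security recommendation.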
