Paper title
An Efficient and Robust System for Vertically Federated Random Forest
Paper authors
Paper abstract
With growing interest in using data from multiple sources to build better machine learning models, many vertically federated learning algorithms have been proposed to preserve the data privacy of the participating organizations. However, the efficiency of existing vertically federated learning algorithms remains a major problem, especially when they are applied to large-scale real-world datasets. In this paper, we present a fast, accurate, scalable, and robust system for vertically federated random forest. With extensive optimization, we achieve $5\times$ and $83\times$ speedups over the SOTA SecureBoost model \cite{cheng2019secureboost} for training and serving tasks, respectively. Moreover, the proposed system achieves similar accuracy while offering favorable scalability and partition tolerance. Our code has been made public to facilitate the development of the community and the protection of user data privacy.