Paper Title

Taming Hybrid-Cloud Fast and Scalable Graph Analytics at Twitter

Authors

Chunxu Tang, Yao Li, Zhenxiao Luo, Mainak Ghosh, Huijun Wu, Lu Zhang, Anneliese Lu, Ruchin Kabra, Nikhil Kantibhai Navadiya, Prachi Mishra, Prateek Mukhedkar, Vrushali Channapattan

Abstract

We have witnessed a boosted demand for graph analytics at Twitter in recent years, and graph analytics has become one of the key parts of Twitter's large-scale data analytics and machine learning for driving engagement, serving the most relevant content, and promoting healthier conversations. However, infrastructure for graph analytics has historically not been an area of investment at Twitter, resulting in a long timeline and huge engineering effort for each project to deal with graphs at the Twitter scale. How do we build a unified graph analytics user experience to fulfill modern data analytics on various graph scales spanning from thousands to hundreds of billions of vertices and edges? To bring fast and scalable graph analytics capability into production, we investigate the challenges we are facing in large-scale graph analytics at Twitter and propose a unified graph analytics platform for efficient, scalable, and reliable graph analytics across on-premises and cloud, to fulfill the requirements of diverse graph use cases and challenging scales. We also conduct quantitative benchmarking on Twitter's production-level graph use cases between popular graph analytics frameworks to certify our solution.
