Bridging the Gap of AutoGraph between Academia and Industry: Analysing AutoGraph Challenge at KDD Cup 2020

Paper Title

Bridging the Gap of AutoGraph between Academia and Industry: Analysing AutoGraph Challenge at KDD Cup 2020

Authors

Zhen Xu, Lanning Wei, Huan Zhao, Rex Ying, Quanming Yao, Wei-Wei Tu, Isabelle Guyon

Abstract

Graph-structured data is ubiquitous in daily life and scientific areas and has attracted increasing attention. Graph Neural Networks (GNNs) have proven effective in modeling graph-structured data, and many variants of GNN architectures have been proposed. However, much human effort is often needed to tune the architecture for each dataset. Researchers have naturally applied Automated Machine Learning (AutoML) to graph learning, aiming to reduce human effort and obtain generally top-performing GNNs, but their methods focus mostly on architecture search. To understand GNN practitioners' automated solutions, we organized the AutoGraph Challenge at KDD Cup 2020, emphasizing automated graph neural networks for node classification. We received top solutions, especially from industrial tech companies such as Meituan, Alibaba and Twitter, which are already open-sourced on GitHub. After detailed comparisons with solutions from academia, we quantify the gaps between academia and industry in modeling scope, effectiveness and efficiency, and show that (1) academic AutoML-for-graph solutions focus on GNN architecture search, while industrial solutions, especially the winning ones in the KDD Cup, tend to build an overall solution; (2) using neural architecture search alone, academic solutions achieve on average 97.3% of the accuracy of industrial solutions; (3) academic solutions are cheap to obtain, requiring several GPU hours, while industrial solutions take a few months of labor. Academic solutions also contain far fewer parameters.
