Paper Title

FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data

Authors

Yuexiang Xie, Zhen Wang, Yaliang Li, Bolin Ding, Nezihe Merve Gürel, Ce Zhang, Minlie Huang, Wei Lin, Jingren Zhou

Abstract

High-order interactive features capture the correlation between different columns and are thus promising for enhancing various learning tasks on ubiquitous tabular data. To automate the generation of interactive features, existing works either explicitly traverse the feature space or implicitly express the interactions via intermediate activations of some designed models. These two kinds of methods show that there is essentially a trade-off between feature interpretability and search efficiency. To possess both of their merits, we propose a novel method named Feature Interaction Via Edge Search (FIVES), which formulates the task of interactive feature generation as searching for edges on the defined feature graph. Specifically, we first present our theoretical evidence that motivates us to search for useful interactive features in increasing order. Then we instantiate this search strategy by optimizing both a dedicated graph neural network (GNN) and the adjacency tensor associated with the defined feature graph. In this way, the proposed FIVES method simplifies the time-consuming traversal into a typical training course of a GNN and enables explicit feature generation according to the learned adjacency tensor. Experimental results on both benchmark and real-world datasets show the advantages of FIVES over several state-of-the-art methods. Moreover, the interactive features identified by FIVES are deployed on the recommender system of Taobao, a worldwide leading e-commerce platform. Results of online A/B testing further verify the effectiveness of the proposed method, and we further provide FIVES as AI utilities for the customers of Alibaba Cloud.
