论文标题

具有分层继承关系的信息网络上的可扩展TOP-K查询

Scalable Top-k Query on Information Networks with Hierarchical Inheritance Relations

论文作者

Wu, Fubao, Gao, Lixin

论文摘要

图形查询,模式挖掘和知识发现在大规模异构信息网络(HINS)上变得具有挑战性。涉及路径传播的最新技术主要集中于节点标签和邻域结构的推断。但是,现实世界中的实体链接还包含丰富的层次继承关系。例如,产品版本的脆弱性可能是从其较旧版本继承的。利用层次继承可以提高查询结果的质量。在此激励的情况下,我们探讨了实体之间的层次结构遗传关系,并在层次结构遗传关系上提出图形查询问题。我们通过将原始查询图分解为多个星形查询,并将星形查询算法分解为每个星形查询来提出图形查询搜索算法。然后构建来自每个星星查询结果的更多候选人,以获取原始查询的最终Top-K查询答案。为了有效地从大型HIN获得图形查询结果,我们通过使用统一的成本搜索来修剪搜索空间,设计一种基于界限的修剪技术。我们在GraphX中实现算法,以测试合成和现实数据集的有效性和效率。与两种常见的图查询算法相比,我们的算法可以有效地获得更准确的结果和竞争性能。

Graph query, pattern mining and knowledge discovery become challenging on large-scale heterogeneous information networks (HINs). State-of-the-art techniques involving path propagation mainly focus on the inference on nodes labels and neighborhood structures. However, entity links in the real world also contain rich hierarchical inheritance relations. For example, the vulnerability of a product version is likely to be inherited from its older versions. Taking advantage of the hierarchical inheritances can potentially improve the quality of query results. Motivated by this, we explore hierarchical inheritance relations between entities and formulate the problem of graph query on HINs with hierarchical inheritance relations. We propose a graph query search algorithm by decomposing the original query graph into multiple star queries and apply a star query algorithm to each star query. Further candidates from each star query result are then constructed for final top-k query answers to the original query. To efficiently obtain the graph query result from a large-scale HIN, we design a bound-based pruning technique by using uniform cost search to prune search spaces. We implement our algorithm in GraphX to test the effectiveness and efficiency on synthetic and real-world datasets. Compared with two common graph query algorithms, our algorithm can effectively obtain more accurate results and competitive performances.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源