How to choose the most appropriate centrality measure? A decision tree approach

Paper title

How to choose the most appropriate centrality measure? A decision tree approach

Authors

Chebotarev, Pavel; Gubanov, Dmitry

Abstract

Centrality metrics play a crucial role in network analysis, and the choice of a specific measure significantly influences the accuracy of conclusions, because each measure represents a distinct concept of node importance. Among more than 400 proposed indices, selecting the most suitable ones for a specific application remains a challenge. Existing approaches (model-based, data-driven, and axiomatic) have limitations: each specific application must be associated with a model, a training dataset, or restrictive axioms. To address this, we introduce the culling method, which relies on the expert's concept of how centrality behaves on simple graphs. The culling method involves forming a set of candidate measures, generating a list of graphs, as small as possible, that distinguish the measures from each other, constructing a decision-tree survey, and identifying the measure consistent with the expert's concept. We apply this approach to a diverse set of 40 centralities, including novel kernel-based indices, and combine it with the axiomatic approach. Remarkably, only 13 small 1-trees are sufficient to separate all 40 measures, even for pairs of closely related ones. By adopting simple ordinal axioms such as Self-consistency or the Bridge axiom, the set of measures can be drastically reduced, making the culling survey short. Applying the culling method provides insightful findings on some centrality indices, such as PageRank, Bridging, and dissimilarity-based Eigencentrality measures, among others. The proposed approach offers a cost-effective solution in terms of labor and time, complements existing methods for measure selection, and provides deeper insight into the underlying mechanisms of centrality measures.
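
The elimination idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: it uses three textbook measures (degree, closeness, and an eccentricity-based index) instead of the 40 studied, two hand-picked small trees instead of generated 1-trees, and a flat list of questions instead of a full decision tree. The names `make_graph`, `verdict`, `cull`, `MEASURES`, and `QUESTIONS` are hypothetical.

```python
from collections import deque
from fractions import Fraction


def make_graph(edges):
    """Build an undirected adjacency dict from a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances from source (connected graph assumed)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def degree(adj, v):
    return Fraction(len(adj[v]))


def closeness(adj, v):
    # Reciprocal of the total distance to all other nodes.
    return Fraction(1, sum(bfs_distances(adj, v).values()))


def eccentricity_index(adj, v):
    # Reciprocal of the distance to the farthest node.
    return Fraction(1, max(bfs_distances(adj, v).values()))


# Illustrative candidate set (the paper considers 40 measures).
MEASURES = {
    "degree": degree,
    "closeness": closeness,
    "eccentricity": eccentricity_index,
}


def verdict(measure, adj, u, v):
    """Return '>', '<' or '=' for how the measure ranks u against v."""
    a, b = measure(adj, u), measure(adj, v)
    if a == b:
        return "="
    return ">" if a > b else "<"


def cull(candidates, questions, expert_answer):
    """Ask one question per graph and drop measures that disagree with the expert."""
    remaining = dict(candidates)
    for adj, u, v in questions:
        if len(remaining) <= 1:
            break
        answer = expert_answer(adj, u, v)
        remaining = {
            name: m for name, m in remaining.items()
            if verdict(m, adj, u, v) == answer
        }
    return sorted(remaining)


# Hand-picked trees (an assumption): on the first, degree disagrees with the
# two distance-based measures; on the second, closeness and eccentricity differ.
G1 = make_graph([(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (5, 6)])
G2 = make_graph([(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)])
QUESTIONS = [(G1, 0, 3), (G2, 1, 2)]

if __name__ == "__main__":
    # Simulate an expert whose intuition happens to coincide with closeness.
    expert = lambda adj, u, v: verdict(closeness, adj, u, v)
    print(cull(MEASURES, QUESTIONS, expert))
```

In a real survey the `expert_answer` callback would be a person looking at a drawing of each graph and saying which of the two marked nodes is more central. Choosing each next graph based on the previous answer, rather than walking a fixed list, is what would turn this into a decision tree.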
