Title
A novel initialisation based on hospital-resident assignment for the k-modes algorithm
Authors
Abstract
This paper presents a new way of selecting an initial solution for the k-modes algorithm that allows for a notion of mathematical fairness and makes use of the data in a way that the common initialisations from the literature do not. The method, which uses the Hospital-Resident Assignment Problem to find the set of initial cluster centroids, is compared with the current initialisations on both benchmark datasets and a body of newly generated artificial datasets. Based on this analysis, the proposed method is shown to outperform the other initialisations in the majority of cases, especially when the number of clusters is optimised. In addition, we find that our method outperforms the leading established method specifically for low-density data.
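For orientation, the following is a minimal sketch of the standard k-modes loop that any initialisation feeds into. It is not the hospital-resident initialisation proposed in the paper, whose details the abstract does not give. The function names (`dissimilarity`, `column_modes`, `k_modes`) and the `initial_modes` parameter are illustrative assumptions; the sketch only shows where a chosen set of initial modes would enter the algorithm.

```python
from collections import Counter


def dissimilarity(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(points):
    """Attribute-wise mode of a non-empty list of categorical points."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*points))


def k_modes(data, initial_modes, max_iter=100):
    """Run k-modes from the given initial modes (hypothetical interface).

    Any initialisation method would supply `initial_modes`;
    here they are simply passed in by the caller.
    """
    modes = list(initial_modes)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: dissimilarity(p, modes[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(len(modes)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster is empty
                modes[j] = column_modes(members)
    return modes, labels
```

Because the loop only ever converges to a local optimum, the final clustering depends on `initial_modes`, which is why the choice of initialisation is the subject of the comparison described in the abstract.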