Paper title
Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets
Paper authors
Paper abstract
Pattern mining is well established in data mining research, especially for binary datasets. Surprisingly, there is much less work on numerical pattern mining, and this research area remains under-explored. In this paper, we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL (minimum description length) principle is a robust and reliable framework widely used in pattern mining as well as in subgroup discovery. In Mint, we reuse MDL to discover useful patterns and to return a set of non-redundant, overlapping patterns that have well-defined boundaries and cover meaningful groups of objects. Mint is not the only MDL-based numerical pattern miner. In the experiments presented in the paper, we show that Mint outperforms its competitors, among them Slim and RealKrimp.