Bayesian Kernel Two-Sample Testing

Paper title

Bayesian Kernel Two-Sample Testing

Authors

Qinyi Zhang, Veit Wild, Sarah Filippi, Seth Flaxman, Dino Sejdinovic

Abstract

In modern data analysis, nonparametric measures of discrepancy between random variables are particularly important. The subject is well studied in the frequentist literature, while development in the Bayesian setting is limited, with applications often restricted to univariate cases. Here, we propose a Bayesian kernel two-sample testing procedure based on modelling the difference between kernel mean embeddings in the reproducing kernel Hilbert space, utilising the framework established by Flaxman et al. (2016). The use of kernel methods enables its application to random variables in generic domains beyond multivariate Euclidean spaces. The proposed procedure results in a posterior inference scheme that allows automatic selection of the kernel parameters relevant to the problem at hand. In a series of synthetic experiments and two real data experiments (i.e. testing network heterogeneity from high-dimensional data and six-membered monocyclic ring conformation comparison), we illustrate the advantages of our approach.
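
For readers unfamiliar with kernel mean embeddings, the quantity the abstract refers to can be illustrated with the classical (frequentist) maximum mean discrepancy, which is the squared RKHS distance between the empirical embeddings of two samples. The sketch below is not the Bayesian procedure proposed in the paper; it only shows the underlying discrepancy. The Gaussian kernel, the fixed `lengthscale`, and the function names `rbf_kernel` and `mmd2_biased` are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))


def mmd2_biased(x, y, lengthscale=1.0):
    """Biased estimate of the squared MMD between samples x and y.

    Equals the squared RKHS norm of the difference between the two
    empirical kernel mean embeddings.
    """
    kxx = rbf_kernel(x, x, lengthscale)
    kyy = rbf_kernel(y, y, lengthscale)
    kxy = rbf_kernel(x, y, lengthscale)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 2))
    y_same = rng.normal(size=(200, 2))            # same distribution as x
    y_shift = rng.normal(loc=2.0, size=(200, 2))  # shifted mean
    print("same distribution:", mmd2_biased(x, y_same))
    print("shifted mean:     ", mmd2_biased(x, y_shift))
```

Here the lengthscale is fixed by hand. The abstract states that the proposed posterior inference scheme selects the kernel parameters automatically, which this sketch does not attempt.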
