Paper title
A generalization for the expected value of the earth mover's distance
Paper authors
Abstract
The earth mover's distance (EMD), also called the first Wasserstein distance, can be naturally extended to compare arbitrarily many probability distributions, rather than only two, on the set $[n]=\{1,\dots,n\}$. We present the details for this generalization, along with a highly efficient algorithm inspired by combinatorics; it turns out that in the special case of three distributions, the EMD is half the sum of the pairwise EMDs. Extending the methods of Bourn and Willenbring (arXiv:1903.03673), we compute the expected value of this generalized EMD on random $d$-tuples of distributions, using a generating function which coincides with the Hilbert series of the Segre embedding. We then use the EMD to analyze a real-world data set of grade distributions.
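The three-distribution identity stated in the abstract can be checked numerically with a minimal sketch like the one below. It is not the paper's algorithm. It assumes the ground cost on $[n]$ is $|i-j|$, and that the cost of matching a triple of bins is the largest index minus the smallest; the abstract states neither. The function names `emd`, `emd3_half_sum`, and `emd3_greedy` are hypothetical, and the inputs are histograms with equal integer totals (use `fractions.Fraction` entries for normalized distributions) so that the arithmetic stays exact.

```python
from itertools import accumulate, combinations


def emd(p, q):
    """EMD between two histograms on [n] with equal total mass, ground cost |i - j| (assumed)."""
    if len(p) != len(q) or sum(p) != sum(q):
        raise ValueError("histograms must have equal length and equal total mass")
    # In one dimension the EMD equals the L1 distance between the cumulative sums.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def emd3_half_sum(p, q, r):
    """Three-way EMD via the identity quoted in the abstract: half the sum of pairwise EMDs."""
    return sum(emd(a, b) for a, b in combinations((p, q, r), 2)) / 2


def emd3_greedy(p, q, r):
    """Cross-check under an assumed triple cost of (max index - min index).

    Repeatedly matches the leftmost remaining mass of all three histograms.
    """
    if not (len(p) == len(q) == len(r)) or not (sum(p) == sum(q) == sum(r)):
        raise ValueError("histograms must have equal length and equal total mass")
    hists = [list(p), list(q), list(r)]
    idx = [0, 0, 0]
    n = len(p)
    total = 0
    while True:
        # Advance each pointer past bins that have been emptied.
        for k in range(3):
            while idx[k] < n and hists[k][idx[k]] == 0:
                idx[k] += 1
        if any(i >= n for i in idx):
            break
        # Move as much mass as all three current bins can supply.
        m = min(hists[k][idx[k]] for k in range(3))
        total += m * (max(idx) - min(idx))
        for k in range(3):
            hists[k][idx[k]] -= m
    return total


p, q, r = [2, 0, 1], [0, 3, 0], [1, 1, 1]
print(emd(p, q), emd(p, r), emd(q, r))  # pairwise values
print(emd3_half_sum(p, q, r), emd3_greedy(p, q, r))
```

For this example the pairwise values are 3, 1, and 2, so half their sum is 3, which matches the greedy total under the assumed triple cost.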