Paper title
Huber-energy measure quantization
Paper authors
Paper abstract
We describe a measure quantization procedure, i.e., an algorithm that finds the best approximation of a target probability law (and, more generally, of a signed finite variation measure) by a sum of $Q$ Dirac masses ($Q$ being the quantization parameter). The procedure is implemented by minimizing the statistical distance between the original measure and its quantized version; the distance is built from a negative definite kernel and, if necessary, can be computed on the fly and fed to a stochastic optimization algorithm (such as SGD, Adam, ...). We investigate theoretically the fundamental question of the existence of the optimal measure quantizer and identify the kernel properties required to guarantee suitable behavior. We propose two best linear unbiased (BLUE) estimators for the squared statistical distance and use them in an unbiased procedure, called HEMQ, to find the optimal quantization. We test HEMQ on several databases: multi-dimensional Gaussian mixtures, Wiener space cubature, Italian wine cultivars, and the MNIST image database. The results indicate that the HEMQ algorithm is robust and versatile and, for the class of Huber-energy kernels, matches the expected intuitive behavior.
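The idea of quantizing a law by stochastically minimizing a kernel distance can be illustrated with a minimal sketch. It is not the HEMQ implementation, and it does not reproduce the paper's BLUE estimators. Several things here are assumptions made only for illustration: the kernel form $\sqrt{a^2 + |x-y|^2} - a$, uniform weights $1/Q$ on the Dirac masses, a plain sample estimate of the squared distance, a hand-coded gradient with a constant SGD step, and the hypothetical names `huber_kernel`, `squared_distance`, `grad_y`, and `quantize`.

```python
import numpy as np


def huber_kernel(x, y, a=1e-3):
    """Pairwise kernel sqrt(a^2 + |x - y|^2) - a (assumed form, for illustration)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt(a * a + np.sum(diff * diff, axis=-1)) - a


def squared_distance(x, y, a=1e-3):
    """Sample estimate of the squared kernel distance between the law of the
    sample x and the uniform sum of Dirac masses located at the rows of y."""
    n = x.shape[0]
    cross = huber_kernel(x, y, a).mean()
    yy = huber_kernel(y, y, a).mean()
    # The kernel vanishes on the diagonal, so dividing the full sum by
    # n * (n - 1) averages over distinct pairs only.
    xx = huber_kernel(x, x, a).sum() / (n * (n - 1))
    return cross - 0.5 * yy - 0.5 * xx


def grad_y(x, y, a=1e-3):
    """Gradient of squared_distance with respect to the Dirac locations y."""
    q = y.shape[0]
    dxy = y[:, None, :] - x[None, :, :]  # shape (Q, B, d)
    dyy = y[:, None, :] - y[None, :, :]  # shape (Q, Q, d)
    gxy = dxy / np.sqrt(a * a + np.sum(dxy * dxy, axis=-1, keepdims=True))
    gyy = dyy / np.sqrt(a * a + np.sum(dyy * dyy, axis=-1, keepdims=True))
    # Attraction towards the data minus repulsion between the Dirac masses.
    return gxy.mean(axis=1) / q - gyy.sum(axis=1) / (q * q)


def quantize(sample, y0, steps=500, batch=64, lr=0.5, a=1e-3, seed=0):
    """Plain SGD on the Dirac locations, drawing a fresh mini-batch per step."""
    rng = np.random.default_rng(seed)
    y = np.array(y0, dtype=float)
    for _ in range(steps):
        x = sample[rng.integers(0, len(sample), size=batch)]
        y -= lr * grad_y(x, y, a)
    return y
```

Calling `quantize(sample, y0)` with `sample` of shape `(N, d)` and an initial guess `y0` of shape `(Q, d)` returns the `Q` updated locations. The gradient combines a term pulling each location towards the data with a term pushing the locations apart, which is what spreads the Dirac masses over the support of the target law.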