Paper title
The role of regularization in classification of high-dimensional noisy Gaussian mixture
Paper authors
Paper abstract
We consider a high-dimensional mixture of two Gaussians in the noisy regime, where even an oracle knowing the centers of the clusters misclassifies a small but finite fraction of the points. We provide a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge and logistic regression, in the high-dimensional limit where the number of samples $n$ and their dimension $d$ go to infinity while their ratio is fixed to $\alpha = n/d$. We discuss surprising effects of the regularization, which in some cases allows one to reach Bayes-optimal performance. We also illustrate the interpolation peak at low regularization and analyze the role of the respective sizes of the two clusters.