Paper Title

Information-Theoretic Analysis of Minimax Excess Risk

Authors

Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei

Abstract

Two main concepts studied in machine learning theory are the generalization gap (the difference between training and test error) and the excess risk (the difference between test error and the minimum possible error). While information-theoretic tools have been used extensively to study the generalization gap of learning algorithms, the information-theoretic nature of excess risk has not yet been fully investigated. In this paper, we take some steps toward this goal. We consider the frequentist problem of minimax excess risk as a zero-sum game between the algorithm designer and the world. We then argue that it is desirable to modify this game so that the order of play can be swapped. We prove that, under some regularity conditions, if the world and the designer can play randomly, the duality gap is zero and the order of play can be changed. In this case, a Bayesian problem surfaces in the dual representation. This makes it possible to utilize recent information-theoretic results on the minimum excess risk in Bayesian learning to provide bounds on the minimax excess risk. We demonstrate the applicability of these results by providing information-theoretic insight into two important classes of problems: classification when the hypothesis space has finite VC-dimension, and regularized least squares.
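
As a rough sketch of the duality argument described in the abstract (the notation below is illustrative and not taken verbatim from the paper): writing ER(A, P) for the excess risk of a (possibly randomized) learning algorithm A when the data are drawn from a distribution P in a class \(\mathcal{P}\), weak duality for the designer-vs-world game gives

\[
\inf_{A}\;\sup_{P \in \mathcal{P}} \mathrm{ER}(A, P)
\;\;\ge\;\;
\sup_{\pi \in \Delta(\mathcal{P})}\;\inf_{A}\; \mathbb{E}_{P \sim \pi}\!\big[\mathrm{ER}(A, P)\big].
\]

The right-hand side is a Bayesian minimum excess risk problem with prior \(\pi\) over \(\mathcal{P}\); under the regularity conditions and randomized play discussed in the paper, the duality gap closes and the inequality becomes an equality, so information-theoretic bounds on the Bayesian quantity transfer to the minimax excess risk.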
