适当的学习，Helly编号和最佳SVM绑定

论文标题

适当的学习，Helly编号和最佳SVM绑定

Proper Learning, Helly Number, and an Optimal SVM Bound

论文作者

Bousquet, Olivier, Hanneke, Steve, Moran, Shay, Zhivotovskiy, Nikita

论文摘要

对于任何经验风险最小化器（ERM），经典的PAC样品复杂性界限均已说明，并包含额外的对数因子$ \ log（1/ε）$，这对于ERM通常是所需的。 Hanneke（2016）最近表明，任何VC C类PAC学习的最佳样本复杂性都是通过特定的不当学习算法来实现的，该算法输出了C的特定多数假设的多数票数C，此问题的问题是，该问题的问题是，该问题何时可以通过适当的学习算法来实现该界限，该算法始终从C. C. c.下输出C. C. C. Hypothessis的限制。在本文中，我们旨在表征通过适当的学习算法可以实现最佳样本复杂性的类。我们确定这些类可以由双重螺旋数来表征，这是在离散几何和抽象凸度中产生的组合参数。特别是，在C的一般条件下，我们表明双重螺旋数在且仅当有适当的学习者获得对$ε$和$δ$的最佳关节依赖性时才有限。作为我们技术的进一步含义，我们解决了Vapnik和Chervonenkis（1974）对支持向量机的性能提出的长期开放问题，证明在可实现的情况下SVM的样本复杂性为$θ（（n/ε）+（1/ε）+（1/ε）\ log log（1/δ）这给出了通过适当的学习算法实现的半空间的第一个最佳PAC结合，此外，计算上是有效的。

The classical PAC sample complexity bounds are stated for any Empirical Risk Minimizer (ERM) and contain an extra logarithmic factor $\log(1/ε)$ which is known to be necessary for ERM in general. It has been recently shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majority-vote of hypotheses in C. This leaves the question of when this bound can be achieved by proper learning algorithms, which are restricted to always output a hypothesis from C. In this paper we aim to characterize the classes for which the optimal sample complexity can be achieved by a proper learning algorithm. We identify that these classes can be characterized by the dual Helly number, which is a combinatorial parameter that arises in discrete geometry and abstract convexity. In particular, under general conditions on C, we show that the dual Helly number is bounded if and only if there is a proper learner that obtains the optimal joint dependence on $ε$ and $δ$. As further implications of our techniques we resolve a long-standing open problem posed by Vapnik and Chervonenkis (1974) on the performance of the Support Vector Machine by proving that the sample complexity of SVM in the realizable case is $Θ((n/ε)+(1/ε)\log(1/δ))$, where $n$ is the dimension. This gives the first optimal PAC bound for Halfspaces achieved by a proper learning algorithm, and moreover is computationally efficient.

下载PDF全文

下载文献需遵守相关版权规定

论文标题