Paper Title
Toward Better Generalization Bounds with Locally Elastic Stability
Paper Authors
Paper Abstract
Algorithmic stability is a key characteristic for ensuring the generalization ability of a learning algorithm. Among the various notions of stability, \emph{uniform stability} is arguably the most popular, and it yields exponential generalization bounds. However, uniform stability only considers the worst-case loss change (the so-called sensitivity) incurred by removing a single data point, which is distribution-independent and therefore undesirable. In many cases, the worst-case sensitivity of the loss is much larger than the average sensitivity taken over the removed data point, especially in advanced models such as random feature models or neural networks. Many previous works attempt to mitigate this distribution independence by proposing weaker notions of stability; however, they either yield only polynomial bounds or yield bounds that do not vanish as the sample size goes to infinity. Motivated by this, we propose \emph{locally elastic stability}, a weaker and distribution-dependent notion of stability that still yields exponential generalization bounds. We further demonstrate that locally elastic stability implies tighter generalization bounds than those derived from uniform stability in many situations, by revisiting the examples of bounded support vector machines, regularized least squares regression, and stochastic gradient descent.
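For concreteness, the following is a minimal sketch of the two stability notions being contrasted, written in standard notation that is assumed here rather than taken from the abstract: $A_S$ denotes the output of a learning algorithm trained on the sample $S = \{z_1, \dots, z_m\}$, $S^{\setminus i}$ denotes $S$ with the $i$-th point removed, and $\ell$ is the loss function.

% Uniform stability (in the sense of Bousquet and Elisseeff, 2002):
% a single distribution-independent constant $\beta_m$ bounds the loss
% change for every sample $S$, every removed index $i$, and every test
% point $z$:
\[
  \sup_{S,\, i,\, z} \bigl| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \bigr| \le \beta_m .
\]
% Locally elastic stability, as sketched here (the paper's exact
% definition may carry additional conditions), replaces the constant by
% a function of both the removed training point $z_i$ and the test
% point $z$:
\[
  \bigl| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \bigr| \le \beta_m(z_i, z)
  \quad \text{for all } S,\, i,\, z .
\]
% The resulting generalization bound can then depend on averages of
% $\beta_m(z_i, z)$ under the data distribution rather than on a single
% worst case, which is why it can be much tighter when most removed
% points barely affect the loss at a typical test point.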