Paper Title

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

Authors

Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

Abstract

We design an algorithm which finds an $ε$-approximate stationary point (with $\|\nabla F(x)\|\le ε$) using $O(ε^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed. We prove a lower bound which establishes that this rate is optimal and---surprisingly---that it cannot be improved using stochastic $p$th order methods for any $p\ge 2$, even when the first $p$ derivatives of the objective are Lipschitz. Together, these results characterize the complexity of non-convex stochastic optimization with second-order methods and beyond. Expanding our scope to the oracle complexity of finding $(ε,γ)$-approximate second-order stationary points, we establish nearly matching upper and lower bounds for stochastic second-order methods. Our lower bounds here are novel even in the noiseless case.
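For context, the Hessian-vector products referenced in the abstract can be computed without ever forming the full Hessian, e.g. by differencing gradients along the direction of interest. Below is a minimal illustrative sketch (the quadratic objective and its gradient are assumptions for demonstration, not the paper's construction):

```python
import numpy as np

# Illustrative objective: f(x) = 0.5 * x^T A x, whose Hessian is A.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # symmetric matrix (assumed example)

def grad(x):
    # Gradient of the quadratic objective: ∇f(x) = A x
    return A @ x

def hessian_vector_product(grad_fn, x, v, eps=1e-6):
    # Finite-difference approximation: H v ≈ (∇f(x + eps*v) − ∇f(x)) / eps.
    # Costs two gradient evaluations, never materializing the d×d Hessian.
    return (grad_fn(x + eps * v) - grad_fn(x)) / eps

x = np.array([1.0, -1.0])
v = np.array([0.0, 1.0])
hv = hessian_vector_product(grad, x, v)
# For a quadratic, the finite difference recovers A @ v up to rounding error.
```

In practice such products are often obtained exactly via automatic differentiation (a gradient of a gradient-vector inner product) rather than finite differences; the cost per product remains comparable to a gradient evaluation, which is what makes the $O(ε^{-3})$ rate in the abstract attainable with first-order-style oracle calls.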
