投影有效的亚级别方法和最佳非平滑弗兰克·沃尔夫方法

论文标题

投影有效的亚级别方法和最佳非平滑弗兰克·沃尔夫方法

Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method

论文作者

Thekumparampil, Kiran Koshy, Jain, Prateek, Netrapalli, Praneeth, Oh, Sewoong

论文摘要

我们考虑在凸约限制集上优化非平滑lipschitz连续凸功能的经典设置，当时可以访问该功能的（随机）一阶甲骨文（FO），以及用于约束集的（po）。众所周知，要在高维度中实现$ε$ -suboptimality，$θ（ε^{ - 2}）$ fo是必要的。这是通过投影亚级别方法（PGD）实现的。但是，PGD还需要$ o（ε^{ - 2}）$ po调用，这可能比fo呼叫（例如核定标准约束）更昂贵。尽管这个问题和广泛的文献具有基本的性质，但改善PGD的复杂性是在很大程度上没有探索的。我们首先提出了这样的改进。这只需要一个温和的假设，即，当目标函数扩展到约束集的稍大邻域时，仍然可以保持Lipschitz，并且可以通过FO访问。特别是，我们介绍了摩托艇方法，该方法仔细地结合了莫罗 - 尤西达平滑和加速的一阶方案。可以保证仅使用$ o（ε^{ - 1}）$ po调用和最佳$ o（ε^{ - 2}）$ fo调用，找到可行的$ε$ -suboptimal解决方案。 Further, instead of a PO if we only have a linear minimization oracle (LMO, a la Frank-Wolfe) to access the constraint set, an extension of our method, MOLES, finds a feasible $ε$-suboptimal solution using $O(ε^{-2})$ LMO calls and FO calls---both match known lower bounds, resolving a question left open since White (1993).我们的实验证实，这些方法比最新的PO和LMO调用问题实现了高度的加速。

We consider the classical setting of optimizing a nonsmooth Lipschitz continuous convex function over a convex constraint set, when having access to a (stochastic) first-order oracle (FO) for the function and a projection oracle (PO) for the constraint set. It is well known that to achieve $ε$-suboptimality in high-dimensions, $Θ(ε^{-2})$ FO calls are necessary. This is achieved by the projected subgradient method (PGD). However, PGD also entails $O(ε^{-2})$ PO calls, which may be computationally costlier than FO calls (e.g. nuclear norm constraints). Improving this PO calls complexity of PGD is largely unexplored, despite the fundamental nature of this problem and extensive literature. We present first such improvement. This only requires a mild assumption that the objective function, when extended to a slightly larger neighborhood of the constraint set, still remains Lipschitz and accessible via FO. In particular, we introduce MOPES method, which carefully combines Moreau-Yosida smoothing and accelerated first-order schemes. This is guaranteed to find a feasible $ε$-suboptimal solution using only $O(ε^{-1})$ PO calls and optimal $O(ε^{-2})$ FO calls. Further, instead of a PO if we only have a linear minimization oracle (LMO, a la Frank-Wolfe) to access the constraint set, an extension of our method, MOLES, finds a feasible $ε$-suboptimal solution using $O(ε^{-2})$ LMO calls and FO calls---both match known lower bounds, resolving a question left open since White (1993). Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.

下载PDF全文

下载文献需遵守相关版权规定

论文标题