Paper Title
Zeroth-Order Regularized Optimization (ZORO): Approximately Sparse Gradients and Adaptive Sampling
Paper Authors
Paper Abstract
We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function. Such optimization is also called derivative-free, zeroth-order, or black-box optimization. We propose a new $\textbf{Z}$eroth-$\textbf{O}$rder $\textbf{R}$egularized $\textbf{O}$ptimization method, dubbed ZORO. When the underlying gradient is approximately sparse at an iterate, ZORO needs very few objective function evaluations to obtain a new iterate that decreases the objective function. We achieve this with an adaptive, randomized gradient estimator, followed by an inexact proximal-gradient scheme. Under a novel approximately sparse gradient assumption and in various convex settings, we show that the (theoretical and empirical) convergence rate of ZORO depends only logarithmically on the problem dimension. Numerical experiments show that ZORO outperforms existing methods with similar assumptions on both synthetic and real datasets.
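The abstract names two components: a randomized estimator that recovers an approximately sparse gradient from a small number of function evaluations, and an inexact proximal-gradient step on the regularized objective. Below is a minimal NumPy sketch of that pipeline. The Rademacher sampling directions, the CoSaMP-style recovery routine, the choice of an l1 regularizer (whose prox is soft-thresholding), and all parameters (sparsity level `s`, sample count `m`, sampling radius `delta`, step size `alpha`) are illustrative assumptions for this sketch, not the authors' exact algorithm or settings.

```python
# Sketch of a ZORO-style step: sparse gradient recovery from a few
# finite-difference measurements, followed by a proximal-gradient update.
# All parameter choices and the recovery routine are illustrative.
import numpy as np


def cosamp(Z, y, s, iters=10, tol=1e-10):
    """Recover an s-sparse g with Z @ g ~= y (CoSaMP-style sketch)."""
    d = Z.shape[1]
    g = np.zeros(d)
    for _ in range(iters):
        r = y - Z @ g                               # current residual
        if np.linalg.norm(r) <= tol:
            break
        proxy = Z.T @ r                             # correlate residual with columns
        omega = np.argsort(np.abs(proxy))[-2 * s:]  # 2s candidate coordinates
        T = np.union1d(omega, np.flatnonzero(g))    # merge with current support
        b = np.zeros(d)
        b[T] = np.linalg.lstsq(Z[:, T], y, rcond=None)[0]  # least squares on T
        g = np.zeros(d)
        keep = np.argsort(np.abs(b))[-s:]           # prune to the s largest entries
        g[keep] = b[keep]
    return g


def sparse_grad_estimate(f, x, s, m, delta=1e-4, rng=None):
    """Estimate an approximately s-sparse gradient of f at x from m
    finite differences along random Rademacher directions."""
    rng = np.random.default_rng(rng)
    Z = rng.choice([-1.0, 1.0], size=(m, x.size)) / np.sqrt(m)
    fx = f(x)
    # y[i] = (f(x + delta * z_i) - f(x)) / delta  ~=  z_i . grad f(x)
    y = np.array([(f(x + delta * z) - fx) / delta for z in Z])
    return cosamp(Z, y, s)


def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (our assumed regularizer)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def zoro_sketch(f, x0, s, m, alpha=0.5, lam=1e-3, steps=50, seed=0):
    """Inexact proximal-gradient loop driven by the sparse estimator."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(steps):
        g_hat = sparse_grad_estimate(f, x, s, m, rng=rng)
        x = soft_threshold(x - alpha * g_hat, alpha * lam)  # prox-gradient step
    return x


# Toy check: a 500-dimensional quadratic whose gradient is supported on
# the first 5 coordinates, so m = 80 << 500 evaluations per step suffice.
def quadratic(x):
    return 0.5 * np.sum(x[:5] ** 2)

x_final = zoro_sketch(quadratic, np.ones(500), s=5, m=80)
print(quadratic(x_final))  # should be close to 0
```

With `m` on the order of `s log d` measurements per iteration, the per-step evaluation cost scales with the gradient's sparsity rather than the ambient dimension, which is the logarithmic dimension dependence the abstract describes.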