Paper Title

A stochastic subspace approach to gradient-free optimization in high dimensions

Paper Authors

David Kozak, Stephen Becker, Alireza Doostan, Luis Tenorio

Paper Abstract

We present a stochastic descent algorithm for unconstrained optimization that is particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The algorithm maps the gradient onto a low-dimensional random subspace of dimension $\ell$ at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. Without requiring a full gradient, this mapping can be performed by computing $\ell$ directional derivatives (e.g., via forward-mode automatic differentiation). We give proofs for convergence in expectation under various convexity assumptions as well as probabilistic convergence results under strong-convexity. Our method extends the well-known Gaussian smoothing technique to descent in subspaces of dimension greater than one, opening the doors to new analysis of Gaussian smoothing when more than one directional derivative is used at each iteration. We also provide a finite-dimensional variant of a special case of the Johnson-Lindenstrauss lemma. Experimentally, we show that our method compares favorably to coordinate descent, Gaussian smoothing, gradient descent and BFGS (when gradients are calculated via forward-mode automatic differentiation) on problems from the machine learning and shape optimization literature.
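
For intuition, below is a minimal sketch of the kind of update the abstract describes, written with JAX so that each of the $\ell$ directional derivatives comes from a forward-mode JVP rather than a full gradient. The subspace dimension `ell`, the $N(0, 1/\ell)$ scaling of the random matrix `P`, the fixed step size, and the quadratic test objective are illustrative assumptions, not the paper's exact algorithmic prescription.

```python
import jax
import jax.numpy as jnp


def stochastic_subspace_descent(f, x0, ell=10, step=0.05, iters=500, seed=0):
    """Sketch: descend along the projection of grad f onto a random ell-dim subspace.

    The full gradient is never formed; only ell directional derivatives
    (forward-mode JVPs) are computed per iteration.
    """
    key = jax.random.PRNGKey(seed)
    x = x0
    d = x0.shape[0]
    for _ in range(iters):
        key, sub = jax.random.split(key)
        # Random subspace basis with i.i.d. N(0, 1/ell) entries so that
        # E[P P^T] = I (one common choice of sketching matrix; others exist).
        P = jax.random.normal(sub, (d, ell)) / jnp.sqrt(ell)
        # ell directional derivatives: entry j is (grad f(x))^T P[:, j].
        dirs = jnp.stack([jax.jvp(f, (x,), (P[:, j],))[1] for j in range(ell)])
        # Projected gradient P P^T grad f(x), then a plain fixed-step update.
        x = x - step * (P @ dirs)
    return x


# Toy usage: a strongly convex quadratic in d = 1000 dimensions.
d = 1000
A = jnp.linspace(1.0, 10.0, d)          # diagonal Hessian spectrum
f = lambda x: 0.5 * jnp.sum(A * x ** 2)
x_final = stochastic_subspace_descent(f, jnp.ones(d))
print(float(f(x_final)))                # objective should drop well below f(x0)
```

Replacing the Gaussian columns of `P` with scaled coordinate vectors would restrict the directional derivatives to the axes, recovering a coordinate-descent-like update, which is the comparison the abstract draws.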
