异步分布式双层优化

论文标题

异步分布式双层优化

Asynchronous Distributed Bilevel Optimization

论文作者

Jiao, Yang, Yang, Kai, Wu, Tiancheng, Song, Dongjin, Jian, Chengtao

论文摘要

二重性优化在许多机器学习任务中起着至关重要的作用，范围从高参数优化到元学习。然而，现有关于双杆优化的研究集中在集中式或同步分布式设置上。集中式的双层优化方法需要向单个服务器收集大量数据，这不可避免地会产生巨大的通信费用，并可能引起数据隐私风险。另一方面，同步分布式的二元优化算法通常会面临Straggler问题，如果一些工人无法做出响应，将立即停止工作。作为一种补救措施，我们提出了异步分布的双层优化（ADBO）算法。拟议的ADBO可以通过非凸高层和下层目标函数来解决双重优化问题，并且从理论上可以保证其收敛性。此外，通过理论分析揭示了ADBO的迭代复杂性获得$ε$ - 定位点的上限由$ \ Mathcal {O}（\ frac {1} {1} {ε^2}）$。已经对公共数据集进行了彻底的经验研究，以阐明拟议的ADBO的有效性和效率。

Bilevel optimization plays an essential role in many machine learning tasks, ranging from hyperparameter optimization to meta-learning. Existing studies on bilevel optimization, however, focus on either centralized or synchronous distributed setting. The centralized bilevel optimization approaches require collecting massive amount of data to a single server, which inevitably incur significant communication expenses and may give rise to data privacy risks. Synchronous distributed bilevel optimization algorithms, on the other hand, often face the straggler problem and will immediately stop working if a few workers fail to respond. As a remedy, we propose Asynchronous Distributed Bilevel Optimization (ADBO) algorithm. The proposed ADBO can tackle bilevel optimization problems with both nonconvex upper-level and lower-level objective functions, and its convergence is theoretically guaranteed. Furthermore, it is revealed through theoretic analysis that the iteration complexity of ADBO to obtain the $ε$-stationary point is upper bounded by $\mathcal{O}(\frac{1}{ε^2})$. Thorough empirical studies on public datasets have been conducted to elucidate the effectiveness and efficiency of the proposed ADBO.

下载PDF全文

下载文献需遵守相关版权规定

论文标题