Paper Title
INTERACT: Achieving Low Sample and Communication Complexities in Decentralized Bilevel Learning over Networks
Paper Authors
Paper Abstract
In recent years, decentralized bilevel optimization problems have received increasing attention in the networking and machine learning communities thanks to their versatility in modeling decentralized learning problems over peer-to-peer networks (e.g., multi-agent meta-learning, multi-agent reinforcement learning, personalized training, and Byzantine-resilient learning). However, for decentralized bilevel optimization over peer-to-peer networks with limited computation and communication capabilities, achieving low sample complexity and achieving low communication complexity are two fundamental challenges that remain under-explored so far. In this paper, we make the first attempt to investigate the class of decentralized bilevel optimization problems in which the outer subproblem is nonconvex and the inner subproblem is strongly convex. Our main contributions in this paper are two-fold: i) We first propose a deterministic algorithm called INTERACT (inner-gradient-descent-outer-tracked-gradient), which requires a sample complexity of $\mathcal{O}(n\epsilon^{-1})$ and a communication complexity of $\mathcal{O}(\epsilon^{-1})$ to solve the bilevel optimization problem, where $n$ and $\epsilon > 0$ are the number of samples at each agent and the desired stationarity gap, respectively. ii) To relax the need for full gradient evaluations in each iteration, we propose a stochastic variance-reduced version of INTERACT (SVR-INTERACT), which improves the sample complexity to $\mathcal{O}(\sqrt{n}\,\epsilon^{-1})$ while achieving the same communication complexity as the deterministic algorithm. To our knowledge, this work is the first to achieve both low sample and communication complexities for solving decentralized bilevel optimization problems over networks. Our numerical experiments also corroborate our theoretical findings.
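For reference, the problem class described above can be written in the standard decentralized bilevel form. The following is a minimal sketch for illustration only; the symbols $m$ (number of agents), $f_i$, $g_i$, $f_{i,j}$, $g_{i,j}$, and $y^{*}(x)$ are assumptions introduced here and are not taken from this abstract:

$$
\min_{x \in \mathbb{R}^{p}} \; \frac{1}{m}\sum_{i=1}^{m} f_i\big(x, y^{*}(x)\big)
\quad \text{s.t.} \quad
y^{*}(x) = \arg\min_{y \in \mathbb{R}^{q}} \; \frac{1}{m}\sum_{i=1}^{m} g_i(x, y),
$$

where each agent $i$ holds a local outer objective $f_i(x, y) = \frac{1}{n}\sum_{j=1}^{n} f_{i,j}(x, y)$ that is nonconvex in $x$ and a local inner objective $g_i(x, y) = \frac{1}{n}\sum_{j=1}^{n} g_{i,j}(x, y)$ that is strongly convex in $y$ (so the inner minimizer $y^{*}(x)$ is unique), each averaged over the agent's $n$ local samples, and agents may exchange information only with their neighbors in the peer-to-peer network.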