与状态依赖的马尔可夫数据约束的随机非covex优化

论文标题

与状态依赖的马尔可夫数据约束的随机非covex优化

Constrained Stochastic Nonconvex Optimization with State-dependent Markov Data

论文作者

Roy, Abhishek, Balasubramanian, Krishnakumar, Ghadimi, Saeed

论文摘要

我们研究了Markovian数据受约束的非凸随机优化问题的随机优化算法。特别是，我们专注于马尔可夫链的过渡内核是国家依赖的情况。这种随机优化问题出现在各种机器学习问题中，包括战略分类和强化学习。对于这个问题，我们研究基于投影的算法和无投影算法。在这两种情况下，我们都确定了随机一阶甲骨文获得适当定义的$ε$ -Stationary点的呼叫数量是$ \ MATHCAL {O}（1/ε^{2.5}）$的顺序。在无投影设置中，我们还确定了对线性最小化oracle的调用数量是$ \ MATHCAL {O}（1/ε^{5.5}）$。我们还凭经验证明了算法在神经网络中战略分类问题上的性能。

We study stochastic optimization algorithms for constrained nonconvex stochastic optimization problems with Markovian data. In particular, we focus on the case when the transition kernel of the Markov chain is state-dependent. Such stochastic optimization problems arise in various machine learning problems including strategic classification and reinforcement learning. For this problem, we study both projection-based and projection-free algorithms. In both cases, we establish that the number of calls to the stochastic first-order oracle to obtain an appropriately defined $ε$-stationary point is of the order $\mathcal{O}(1/ε^{2.5})$. In the projection-free setting we additionally establish that the number of calls to the linear minimization oracle is of order $\mathcal{O}(1/ε^{5.5})$. We also empirically demonstrate the performance of our algorithm on the problem of strategic classification with neural networks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题