Paper Title
Manipulating Reinforcement Learning: Poisoning Attacks on Cost Signals
Paper Authors
Paper Abstract
This chapter studies emerging cyber-attacks on reinforcement learning (RL) and introduces a quantitative approach to analyzing the vulnerabilities of RL. Focusing on adversarial manipulation of the cost signals, we analyze the performance degradation of TD($\lambda$) and $Q$-learning algorithms under such manipulation. For TD($\lambda$), the approximation learned from the manipulated costs has an approximation error bound proportional to the magnitude of the attack, and the effect of the attack on this bound does not depend on the choice of $\lambda$. For $Q$-learning, we show that the algorithm converges under stealthy attacks with bounded falsifications of the cost signals. We characterize the relation between the falsified costs, the $Q$-factors, and the policy learned by the agent, which provides fundamental limits on feasible offensive and defensive moves. We propose a robust region of costs within which the adversary can never achieve its targeted policy, and we provide conditions on the falsified costs under which the agent is misled into learning the adversary's favored policy. A case study of TD($\lambda$) learning corroborates the results.
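To make the setting concrete, the following is a minimal sketch (not code from the chapter) of tabular TD($\lambda$) policy evaluation in which an adversary falsifies the observed cost signal by a bounded amount. The chain size, discount factor, step size, attack bound, and the constant-shift attack model are all illustrative assumptions.

```python
# Illustrative sketch, not the chapter's code: tabular TD(lambda) policy
# evaluation when an adversary falsifies the observed cost signal.
# All parameter values and the attack model below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_states = 5
gamma = 0.9        # discount factor
lam = 0.8          # eligibility-trace parameter lambda
alpha = 0.05       # step size
attack_eps = 0.5   # bound on the falsification |c_tilde - c| <= attack_eps

# Random ergodic Markov chain under a fixed policy, with per-state costs.
P = rng.dirichlet(np.ones(n_states), size=n_states)   # row-stochastic
true_cost = rng.uniform(0.0, 1.0, size=n_states)

# Ground-truth cost-to-go J = (I - gamma * P)^{-1} c, for comparison.
J_true = np.linalg.solve(np.eye(n_states) - gamma * P, true_cost)

def td_lambda(poison, n_steps=200_000):
    """Tabular TD(lambda) with accumulating traces; poison(s) falsifies the cost."""
    J = np.zeros(n_states)
    z = np.zeros(n_states)                      # eligibility traces
    s = 0
    for _ in range(n_steps):
        s_next = rng.choice(n_states, p=P[s])
        c = true_cost[s] + poison(s)            # cost signal the agent observes
        delta = c + gamma * J[s_next] - J[s]    # TD error on the falsified cost
        z = gamma * lam * z
        z[s] += 1.0
        J = J + alpha * delta * z
        s = s_next
    return J

J_clean = td_lambda(lambda s: 0.0)
J_poisoned = td_lambda(lambda s: attack_eps)    # constant bounded falsification

# A constant shift of eps in every cost inflates each value estimate by
# roughly eps / (1 - gamma): the error grows in proportion to the attack
# magnitude, in the spirit of the bound described in the abstract.
print("max |J_clean    - J_true| :", np.max(np.abs(J_clean - J_true)))
print("max |J_poisoned - J_true| :", np.max(np.abs(J_poisoned - J_true)))
print("eps / (1 - gamma)         :", attack_eps / (1 - gamma))
```

Varying `attack_eps` while holding `lam` fixed (or vice versa) in this sketch mirrors the abstract's two claims: the approximation error scales with the attack magnitude, while the attack's effect on the bound is insensitive to the choice of $\lambda$.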