Paper title
Moral reinforcement learning using actual causation
Paper authors
Paper abstract
Reinforcement learning systems will increasingly make decisions that significantly affect human well-being, so it is essential that these systems make decisions that conform to our expectations of morally good behavior. The morally good is often defined in causal terms, such as whether one's actions have in fact caused a particular outcome and whether that outcome could have been anticipated. We propose an online reinforcement learning method that learns a policy under the constraint that the agent should not be the cause of harm. This is accomplished by defining cause using the theory of actual causation and assigning blame to the agent when its actions are the actual cause of an undesirable outcome. We conduct experiments on a toy ethical dilemma in which a natural choice of reward function leads to clearly undesirable behavior, but our method learns a policy that avoids being the cause of harmful behavior, demonstrating the soundness of our approach. Allowing an agent to learn while observing causal moral distinctions such as blame opens the possibility of learning policies that better conform to our moral judgments.