Paper Title
Let Me At Least Learn What You Really Like: Dealing With Noisy Humans When Learning Preferences
Paper Authors
Abstract
Learning the preferences of a human improves the quality of the interaction with that human. The number of queries available for learning preferences may be limited, especially when interacting with a human, so active learning is a must. One approach to active learning is to use uncertainty sampling to decide how informative a query is. In this paper, we propose a modification to uncertainty sampling that uses the expected output value to help speed up the learning of preferences. We compare our approach with the uncertainty sampling baseline and conduct an ablation study to test the validity of each component of our approach.
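For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of plain uncertainty sampling for a binary like/dislike preference. It is not the modification proposed in the paper, whose details are not given here. The function names `binary_entropy` and `pick_query` and the example probabilities are hypothetical, and the sketch assumes the learner already outputs a probability that each candidate item is liked.

```python
import numpy as np

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli prediction with probability p."""
    # Clip to avoid log(0) for fully confident predictions.
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def pick_query(probs):
    """Return the index of the candidate whose prediction is most uncertain.

    probs: predicted probabilities that the human likes each candidate
    (hypothetical model outputs, assumed for illustration).
    """
    entropies = binary_entropy(np.asarray(probs, dtype=float))
    return int(np.argmax(entropies))

# Hypothetical predictions for four candidate items.
candidate_probs = [0.95, 0.52, 0.10, 0.70]
next_query = pick_query(candidate_probs)
print(next_query)
```

Under this baseline the item asked about next is the one whose predicted probability is closest to 0.5, regardless of how much the human is expected to like it.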