Paper Title
Learning Transferable Reward for Query Object Localization with Policy Adaptation
Paper Authors
Paper Abstract
We propose a reinforcement-learning-based approach to query object localization, in which an agent is trained to localize objects of interest specified by a small exemplary set. We learn a transferable reward signal, formulated from the exemplary set, by ordinal metric learning. Our proposed method enables test-time policy adaptation to new environments where reward signals are not readily available, and it outperforms fine-tuning approaches that are limited to annotated images. In addition, the transferable reward allows the trained agent to be repurposed from one specific class to another. Experiments on the corrupted MNIST, CU-Birds, and COCO datasets demonstrate the effectiveness of our approach.
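The abstract gives no formulas, so the following is only an illustrative sketch of how a reward derived from an exemplary set by ordinal metric learning might look, not the authors' implementation. It assumes that image crops are already mapped to embedding vectors, that the exemplary set is summarized by the mean of its embeddings, that "ordinal" means a box with higher IoU should lie closer to that prototype, and that the reward is the decrease in embedding distance after an agent action. The names prototype, ordinal_margin_loss, and embedding_reward, and the margin value, are all hypothetical.

```python
import numpy as np


def prototype(exemplar_embeddings):
    """Summarize the exemplary set as the mean embedding (an assumption)."""
    return np.mean(np.asarray(exemplar_embeddings, dtype=float), axis=0)


def ordinal_margin_loss(emb_a, emb_b, proto, iou_a, iou_b, margin=0.1):
    """Hinge loss that asks the higher-IoU box to sit closer to the prototype.

    This is a generic pairwise ranking loss used here as a stand-in for
    ordinal metric learning; the margin is a hypothetical hyperparameter.
    """
    d_a = np.linalg.norm(np.asarray(emb_a, dtype=float) - proto)
    d_b = np.linalg.norm(np.asarray(emb_b, dtype=float) - proto)
    if iou_a == iou_b:
        return 0.0  # no ordering constraint between equally good boxes
    if iou_a > iou_b:
        return float(max(0.0, margin + d_a - d_b))
    return float(max(0.0, margin + d_b - d_a))


def embedding_reward(emb_before, emb_after, proto):
    """Reward an action by how much it reduces the distance to the prototype.

    No ground-truth box is needed, which is what would let such a signal
    be evaluated at test time on unannotated images.
    """
    d_before = np.linalg.norm(np.asarray(emb_before, dtype=float) - proto)
    d_after = np.linalg.norm(np.asarray(emb_after, dtype=float) - proto)
    return float(d_before - d_after)


# Toy usage with 2-D embeddings.
proto = prototype([[0.0, 0.0], [2.0, 0.0]])
step_reward = embedding_reward([4.0, 0.0], [2.0, 0.0], proto)
loss_ok = ordinal_margin_loss([2.0, 0.0], [4.0, 0.0], proto, iou_a=0.8, iou_b=0.3)
loss_bad = ordinal_margin_loss([2.0, 0.0], [4.0, 0.0], proto, iou_a=0.3, iou_b=0.8)
```

Under these assumptions the reward depends only on embeddings and the exemplary set, so in principle it could be recomputed for a new class by swapping in that class's exemplars.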