Learning from Demonstration with Weakly Supervised Disentanglement

Paper title

Learning from Demonstration with Weakly Supervised Disentanglement

Authors

Yordan Hristov, Subramanian Ramamoorthy

Abstract

Robotic manipulation tasks, such as wiping with a soft sponge, require control from multiple rich sensory modalities. Human-robot interaction, aimed at teaching robots, is difficult in this setting as there is potential for mismatch between human and machine comprehension of the rich data streams. We treat the task of interpretable learning from demonstration as an optimisation problem over a probabilistic generative model. To account for the high-dimensionality of the data, a high-capacity neural network is chosen to represent the model. The latent variables in this model are explicitly aligned with high-level notions and concepts that are manifested in a set of demonstrations. We show that such alignment is best achieved through the use of labels from the end user, in an appropriately restricted vocabulary, in contrast to the conventional approach of the designer picking a prior over the latent variables. Our approach is evaluated in the context of two table-top robot manipulation tasks performed by a PR2 robot -- that of dabbing liquids with a sponge (forcefully pressing a sponge and moving it along a surface) and pouring between different containers. The robot provides visual information, arm joint positions and arm joint efforts. We have made videos of the tasks and data available - see supplementary materials at: https://sites.google.com/view/weak-label-lfd.
