Paper title
Memory-based gaze prediction in deep imitation learning for robot manipulation
Authors
Abstract
Deep imitation learning is a promising approach that does not require hard-coded control rules for autonomous robot manipulation. Current applications of deep imitation learning to robot manipulation have been limited to reactive control based on the state at the current time step. However, future robots will also be required to solve tasks using memory acquired through experience in complex environments (e.g., when a robot is asked to find a previously used object on a shelf). In such situations, simple deep imitation learning may fail because of distractions caused by the complex environment. We propose gaze prediction from sequential visual input, which enables the robot to perform manipulation tasks that require memory. The proposed algorithm uses a Transformer-based self-attention architecture for gaze estimation on sequential data, implementing memory. The method was evaluated on a real-robot multi-object manipulation task that requires memory of previous states.
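The abstract does not specify the architecture in detail; as a rough, hedged illustration only, the sketch below shows how a single self-attention layer could aggregate a sequence of per-frame visual features into a 2-D gaze-point prediction. All names, dimensions, and the readout choice (predicting gaze from the most recent time step) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence.
    x: (T, d) per-frame visual features; wq/wk/wv: (d, d) projection matrices."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])       # (T, T) attention logits
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over time steps
    return attn @ v                              # (T, d) context features

def predict_gaze(features, params):
    """Attend over the frame sequence, then read out a 2-D gaze point
    from the latest time step (an assumed readout choice)."""
    ctx = self_attention(features, params["wq"], params["wk"], params["wv"])
    return ctx[-1] @ params["w_out"]             # (2,) predicted (x, y) gaze

# Toy example with random weights and 5 frames of 8-D visual features.
rng = np.random.default_rng(0)
d = 8
params = {k: rng.standard_normal((d, d)) * 0.1 for k in ("wq", "wk", "wv")}
params["w_out"] = rng.standard_normal((d, 2)) * 0.1
features = rng.standard_normal((5, d))
gaze = predict_gaze(features, params)
print(gaze.shape)  # (2,)
```

Because the attention weights span all time steps, the gaze prediction at the current frame can depend on frames seen earlier in the sequence, which is the mechanism the abstract describes as implementing memory.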