Paper Title
Multi-Modal Fusion in Contact-Rich Precise Tasks via Hierarchical Policy Learning
Paper Authors
Paper Abstract
Combined visual and force feedback plays an essential role in contact-rich robotic manipulation tasks. Current methods focus on developing feedback control around a single modality while underestimating the synergy among sensors. Fusing different sensor modalities is necessary but remains challenging. A key difficulty is achieving an effective multi-modal control scheme that generalizes to novel objects with high precision. This paper proposes a practical multi-modal sensor fusion mechanism based on hierarchical policy learning. First, we use a self-supervised encoder that extracts multi-view visual features and a hybrid motion/force controller that regulates force behaviors. Next, multi-modal fusion is simplified by hierarchically integrating vision, force, and proprioceptive data within a reinforcement learning (RL) algorithm. Moreover, with hierarchical policy learning, the control scheme can probe the limits of visual feedback and explore the contribution of each individual modality in precise tasks. Experiments show that a robot equipped with this control scheme can assemble objects with 0.25 mm clearance in simulation. The system generalizes to widely varied initial configurations and new shapes, and experiments validate that it transfers robustly from simulation to reality without fine-tuning.
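The abstract does not specify the network architecture, so the following is a minimal, hypothetical sketch (in PyTorch) of how a hierarchical fusion policy of this kind might be structured: a high-level module acts on encoded multi-view visual features, while a low-level module refines the command using force and proprioceptive feedback. All module names, dimensions, and the residual-correction design are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn


class HierarchicalFusionPolicy(nn.Module):
    """Hypothetical sketch of hierarchical multi-modal fusion.

    A high-level policy consumes encoded visual features (e.g. from a
    pretrained self-supervised multi-view encoder) and proposes a coarse
    motion command; a low-level policy corrects it using force/torque and
    proprioceptive data. Dimensions are illustrative only.
    """

    def __init__(self, vis_dim=128, force_dim=6, proprio_dim=7, act_dim=6):
        super().__init__()
        # High level: coarse motion command from visual features alone.
        self.high_level = nn.Sequential(
            nn.Linear(vis_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )
        # Low level: residual correction from contact feedback.
        self.low_level = nn.Sequential(
            nn.Linear(act_dim + force_dim + proprio_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim),
        )

    def forward(self, vis_feat, wrench, joint_state):
        # Vision-driven coarse sub-goal.
        coarse = self.high_level(vis_feat)
        # Contact-driven residual, conditioned on the coarse command.
        residual = self.low_level(
            torch.cat([coarse, wrench, joint_state], dim=-1))
        # The combined command would be passed to a hybrid motion/force
        # controller, which the paper uses to regulate force behaviors.
        return coarse + residual
```

In a sketch like this, the division of labor mirrors the paper's motivation: vision handles coarse alignment where its resolution suffices, and force/proprioception take over in the sub-millimeter contact phase where visual feedback reaches its limits.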