Paper Title
Kinova Gemini: Interactive Robot Grasping with Visual Reasoning and Conversational AI
Authors
Abstract
To facilitate recent advances in robotics and AI for delicate collaboration between humans and machines, we propose Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning so that the Kinova Gen3 lite robot can help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to the Kinova Gen3 lite, Kinova Gemini can fulfill the user's requests in three different applications: (1) It can start a natural dialogue with people to interact and assist them in retrieving objects, handing the objects to the user one by one. (2) It detects diverse objects with YOLO v3 and recognizes the color attributes of each item, then either asks through dialogue whether the user wants the item grasped or lets the user choose which specific item is required. (3) It applies YOLO v3 to recognize multiple objects and lets the user choose two items for perception-based pick-and-place tasks such as "Put the banana into the bowl", using visual reasoning and conversational interaction.
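As an illustration of application (3), the following is a minimal sketch of how a spoken command such as "Put the banana into the bowl" might be grounded in a list of detected objects. It is not the authors' implementation: the `Detection` structure, the `find_object` and `parse_pick_and_place` helpers, and the keyword-matching strategy are all assumptions made for this example, and the detections are supplied by hand rather than produced by YOLO v3. It also assumes single-word class labels.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure, not taken from the paper)."""
    label: str                       # class name, e.g. "banana"
    color: str                       # coarse color attribute, e.g. "yellow"
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels

    @property
    def center(self) -> Tuple[float, float]:
        """Pixel center of the bounding box, a possible grasp/placement target."""
        x0, y0, x1, y1 = self.box
        return ((x0 + x1) / 2, (y0 + y1) / 2)


def find_object(phrase: str, detections: List[Detection]) -> Optional[Detection]:
    """Return a detection whose label occurs as a word in the phrase.

    If several detections share the label, prefer one whose color is also
    mentioned; otherwise fall back to the first match. Returns None when
    no label matches.
    """
    words = phrase.lower().split()
    candidates = [d for d in detections if d.label in words]
    for d in candidates:
        if d.color in words:
            return d
    return candidates[0] if candidates else None


def parse_pick_and_place(
    command: str, detections: List[Detection]
) -> Tuple[Optional[Detection], Optional[Detection]]:
    """Split a 'put X into Y' style command into (object to pick, place target)."""
    text = command.lower().strip().rstrip(".")
    # Assumed set of prepositions separating the picked item from the target.
    for prep in (" into ", " in ", " onto ", " on "):
        if prep in text:
            pick_phrase, place_phrase = text.split(prep, 1)
            return (find_object(pick_phrase, detections),
                    find_object(place_phrase, detections))
    return None, None
```

In a full system, the box centers returned here would still have to be converted from image coordinates into robot workspace coordinates before any motion command is issued; that step depends on camera calibration and is outside the scope of this sketch.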