Paper Title
Learning to Walk: Spike Based Reinforcement Learning for Hexapod Robot Central Pattern Generation
Paper Authors
Paper Abstract
Learning to walk, i.e., learning locomotion under performance and energy constraints, continues to be a challenge in legged robotics. Methods such as stochastic gradient and deep reinforcement learning (RL) have been explored for bipeds, quadrupeds and hexapods. These techniques are computationally intensive and often prohibitive for edge applications. They also rely on complex sensors and pre-processing of data, which further increases energy consumption and latency. Recent advances in spiking neural networks (SNNs) promise a significant reduction in computation owing to the sparse firing of neurons, and SNNs have been shown to integrate reinforcement learning mechanisms with biologically observed spike-time-dependent plasticity (STDP). However, training a legged robot to walk by learning the synchronization patterns of central pattern generators (CPGs) in an SNN framework has not been shown. Doing so can marry the efficiency of SNNs with the synchronized locomotion of CPG-based systems, providing breakthrough end-to-end learning in mobile robotics. In this paper, we propose a reinforcement-based stochastic weight update technique for training a spiking CPG. The whole system is implemented on a lightweight Raspberry Pi platform with integrated sensors, thus opening up exciting new possibilities.