Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle

Paper Title

Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle

Authors

Zhenshan Bing, Claus Meschede, Guang Chen, Alois Knoll, Kai Huang

Abstract

Building spiking neural networks (SNNs) based on biological synaptic plasticity holds promising potential for fast and energy-efficient computing, which benefits mobile robotic applications. However, implementations of SNNs in robotics are limited by the lack of practical training methods. In this paper, we therefore introduce both indirect and direct end-to-end training methods for SNNs that control a lane-keeping vehicle. First, we adopt a policy learned using the Deep Q-Learning (DQN) algorithm and then transfer it to an SNN using supervised learning. Second, we adopt reward-modulated spike-timing-dependent plasticity (R-STDP) to train SNNs directly, since it combines the advantages of reinforcement learning and the well-known spike-timing-dependent plasticity (STDP). We examine the proposed approaches in three scenarios in which a robot is controlled to keep within lane markings using an event-based neuromorphic vision sensor. We further demonstrate the advantages of the R-STDP approach in terms of lateral localization accuracy and training time steps by comparing it with the three other algorithms presented in this paper.
