Paper Title

EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors

Paper Authors

Vivek Mohan, Deepak Singla, Tarun Pulluri, Andres Ussa, Pradeep Kumar Gopalakrishnan, Pao-Sheng Sun, Bharath Ramesh, Arindam Basu

Paper Abstract

As an alternative sensing paradigm, dynamic vision sensors (DVS) have recently been explored to tackle scenarios where conventional sensors result in high data rates and processing times. This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor, exploiting the sparse DVS output in a low-power traffic-monitoring setting. Specifically, we propose a hardware-efficient processing pipeline that optimizes memory and computational needs, enabling long-term battery-powered usage for IoT applications. To exploit the background-removal property of a static DVS, we propose an event-based binary image creation that signals the presence or absence of events within a frame duration. This reduces the memory requirement and enables the use of simple algorithms such as median filtering and connected component labeling for denoising and region proposal, respectively. To overcome the resulting fragmentation, a YOLO-inspired neural-network-based detector and classifier is proposed to merge fragmented region proposals. Finally, a new overlap-based tracker is implemented that exploits the overlap between detections and tracks, with heuristics to handle occlusion. The proposed pipeline is evaluated on more than 5 hours of traffic recordings spanning three different locations and two different neuromorphic sensors (DVS and CeleX), and demonstrates similar performance across them. Compared to existing event-based feature trackers, our method provides similar accuracy while requiring approximately 6x fewer computations. To the best of our knowledge, this is the first time a stationary-DVS-based traffic monitoring solution has been extensively compared to simultaneously recorded RGB frame-based methods, showing great promise by outperforming state-of-the-art deep learning solutions.
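The front end of the pipeline described in the abstract (binary image creation, median filtering for denoising, and connected component labeling for region proposals) can be sketched roughly as follows. This is a minimal illustration using SciPy, not the paper's implementation; the function name, the `(x, y)` event format, and the `min_area` noise threshold are assumptions for the sketch.

```python
import numpy as np
from scipy import ndimage

def binary_frame_region_proposals(events, height, width, min_area=4):
    """Accumulate DVS events into a binary image (1 bit per pixel: event
    present or absent over one frame duration), denoise it with a 3x3
    median filter, and extract bounding-box region proposals via
    connected component labeling.

    `events` is assumed to be an iterable of (x, y) pixel coordinates
    fired within the frame duration; polarity and timestamps are ignored.
    """
    img = np.zeros((height, width), dtype=np.uint8)
    for x, y in events:
        img[y, x] = 1                         # mark pixels that saw an event

    img = ndimage.median_filter(img, size=3)  # suppress isolated noise events

    labels, _ = ndimage.label(img)            # group remaining active pixels
    proposals = []
    for sl in ndimage.find_objects(labels):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        if h * w >= min_area:                 # drop tiny residual blobs
            proposals.append((sl[1].start, sl[0].start, w, h))  # (x, y, w, h)
    return img, proposals
```

Because the frame is binary rather than event-count or grayscale, both the median filter and the labeling step operate on 1-bit data, which is what keeps the memory and compute footprint small on embedded hardware.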
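The overlap-based association between detections and existing tracks can be illustrated with a greedy matcher like the one below. The paper's exact overlap measure and its occlusion heuristics are not specified in the abstract, so the normalization by the smaller box's area, the `thresh` value, and all names here are assumptions for the sketch.

```python
def overlap_ratio(a, b):
    """Intersection area of boxes a, b (each (x, y, w, h)), normalized by
    the smaller box's area -- one plausible 'overlap' measure."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    return (ix * iy) / min(aw * ah, bw * bh)

def associate(tracks, detections, thresh=0.5):
    """Greedily assign each detection to the unused track it overlaps most,
    provided the overlap exceeds `thresh`. Returns {detection_idx: track_idx};
    unmatched detections would seed new tracks in a full tracker."""
    assigned, used = {}, set()
    for di, det in enumerate(detections):
        best, best_ov = None, thresh
        for ti, trk in enumerate(tracks):
            if ti in used:
                continue
            ov = overlap_ratio(trk, det)
            if ov > best_ov:
                best, best_ov = ti, ov
        if best is not None:
            assigned[di] = best
            used.add(best)
    return assigned
```

Such overlap matching needs only a handful of comparisons and multiplications per detection-track pair, consistent with the abstract's claim of roughly 6x fewer computations than event-based feature trackers.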
