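Semantic Segmentation for Autonomous Driving: Model Evaluation, Dataset Generation, Perspective Comparison, and Real-Time Capability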

Paper Title

Semantic Segmentation for Autonomous Driving: Model Evaluation, Dataset Generation, Perspective Comparison, and Real-Time Capability

Authors

Cakir, Senay; Gauß, Marcel; Häppeler, Kai; Ounajjar, Yassine; Heinle, Fabian; Marchthaler, Reiner

Abstract

Environmental perception is an important aspect of autonomous driving: it provides crucial information about the driving domain, including but not limited to identifying clear driving areas and surrounding obstacles. Semantic segmentation is a widely used perception method for self-driving cars that associates each pixel of an image with a predefined class. In this context, several segmentation models are evaluated with regard to accuracy and efficiency. Experimental results on the generated dataset confirm that the segmentation model FasterSeg is fast enough to be used in real time on low-power computational (embedded) devices in self-driving cars. A simple method for generating synthetic training data for the model is also introduced. Moreover, the accuracy of the first-person perspective is compared with that of the bird's eye view perspective. For a $320 \times 256$ input in the first-person perspective, FasterSeg achieves $65.44\,\%$ mean Intersection over Union (mIoU), and for a $320 \times 256$ input in the bird's eye view perspective, FasterSeg achieves $64.08\,\%$ mIoU. Both perspectives achieve a frame rate of $247.11$ Frames per Second (FPS) on the NVIDIA Jetson AGX Xavier. Lastly, the frame rate and the accuracy of both perspectives are measured on the target hardware and compared for 16-bit floating-point (FP16) and 32-bit floating-point (FP32) arithmetic.
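
The mIoU figures quoted in the abstract average, over the classes, the ratio between the pixels on which prediction and ground truth agree for a class and the pixels that either of them assigns to that class. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `mean_iou`, the nested-list label maps, and the choice to skip classes that appear in neither map are assumptions made for illustration; the abstract does not state how the paper handles absent or ignored classes.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean Intersection over Union for two label maps of equal shape.

    Assumption: classes that occur in neither map are left out of the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The pixel counts toward both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2 x 3 label maps with three classes (0, 1, 2).
prediction = [[0, 0, 1], [1, 1, 2]]
ground_truth = [[0, 1, 1], [1, 1, 2]]
print(mean_iou(prediction, ground_truth, num_classes=3))
```

In this toy example the per-class IoU values are 1/2, 3/4, and 1, so a single mislabeled pixel already lowers the mean noticeably when a class covers only a few pixels.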
