Paper Title
GPU Parallelization of Policy Iteration RRT#
Paper Authors
Paper Abstract
Sampling-based planning has become a de facto standard for complex robots given its superior ability to rapidly explore high-dimensional configuration spaces. Most existing optimal sampling-based planning algorithms are sequential in nature and cannot take advantage of the wide parallelism available on modern computer hardware. Further, the tight synchronization of the exploration and exploitation phases in these algorithms limits sample throughput and planner performance. Policy Iteration RRT# (PI-RRT#) exposes fine-grained parallelism during the exploitation phase, but this parallelism has not yet been evaluated using a concrete implementation. We first present a novel GPU implementation of PI-RRT#'s exploitation phase and discuss data structure considerations for maximizing parallel performance. Our implementation achieves a 3-4x speedup over a serial PI-RRT# implementation, decreasing overall planning time by 77.9% on average. As a second contribution, we introduce the Batched-Extension RRT# algorithm, which loosens the synchronization present in PI-RRT# to realize independent 12.97x and 12.54x speedups under serial and parallel exploitation, respectively.
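The abstract characterizes the exploitation phase as fine-grained parallel value updates over the search graph. As a rough illustration of that idea (a hypothetical sketch, not the paper's actual implementation), the CUDA fragment below relaxes each vertex's cost-to-come over its incoming edges with one thread per vertex, double-buffering values and sweeping until a fixed point. The kernel and function names, the CSR adjacency layout, and the convergence test are all assumptions made for this example.

```cuda
#include <cuda_runtime.h>
#include <utility>  // std::swap

// One thread per vertex: Bellman-style relaxation of cost-to-come
// over that vertex's incoming edges (hypothetical CSR layout).
__global__ void relaxSweep(int numVertices,
                           const int* __restrict__ rowOffsets,  // CSR offsets, length numVertices+1
                           const int* __restrict__ predIds,     // predecessor vertex per incoming edge
                           const float* __restrict__ edgeCost,  // cost of each incoming edge
                           const float* __restrict__ valIn,     // values from the previous sweep
                           float* __restrict__ valOut,
                           int* changed) {
    int v = blockIdx.x * blockDim.x + threadIdx.x;
    if (v >= numVertices) return;
    float best = valIn[v];
    for (int e = rowOffsets[v]; e < rowOffsets[v + 1]; ++e) {
        best = fminf(best, valIn[predIds[e]] + edgeCost[e]);
    }
    valOut[v] = best;
    if (best < valIn[v]) atomicExch(changed, 1);  // some value improved: sweep again
}

// Host driver: iterate double-buffered sweeps until no value improves.
// dValA initially holds 0 at the root and +inf elsewhere; on return,
// dValA points at the converged values.
void relaxToFixedPoint(int numVertices, const int* dRowOffsets, const int* dPredIds,
                       const float* dEdgeCost, float*& dValA, float*& dValB) {
    int* dChanged;
    cudaMalloc(&dChanged, sizeof(int));
    int threads = 256, blocks = (numVertices + threads - 1) / threads;
    int hChanged = 1;
    while (hChanged) {
        cudaMemset(dChanged, 0, sizeof(int));
        relaxSweep<<<blocks, threads>>>(numVertices, dRowOffsets, dPredIds,
                                        dEdgeCost, dValA, dValB, dChanged);
        cudaMemcpy(&hChanged, dChanged, sizeof(int), cudaMemcpyDeviceToHost);
        std::swap(dValA, dValB);  // this sweep's output feeds the next sweep
    }
    cudaFree(dChanged);
}
```

With nonnegative edge costs this converges in at most |V| sweeps, as in parallel Bellman-Ford; each sweep is embarrassingly parallel across vertices, which is the kind of fine-grained parallelism the abstract attributes to the exploitation phase.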