Paper Title
HipaccVX: Wedding of OpenVX and DSL-based Code Generation
Paper Authors
Paper Abstract
Writing programs for heterogeneous platforms optimized for high performance is hard, since the code must be tuned at a low level with architecture-specific optimizations that are often based on fundamentally different programming paradigms and languages. OpenVX promises to solve this issue for computer vision applications with a royalty-free industry standard that is based on a graph-execution model. Yet OpenVX's algorithm space is constrained to a small set of vision functions, which hinders the acceleration of computations that are not included in the standard. In this paper, we analyze OpenVX vision functions to find an orthogonal set of computational abstractions. Based on these abstractions, we couple an existing Domain-Specific Language (DSL) back end to the OpenVX environment and provide language constructs to the programmer for the definition of user-defined nodes. In this way, we enable optimizations that cannot be detected by OpenVX graph implementations using the standard computer vision functions. For our benchmarks, these optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50%. Finally, we show that our proposed compiler framework, called HipaccVX, can achieve better results than the state-of-the-art approaches Nvidia VisionWorks and Halide-HLS.