Analytical Model of Memory-Bound Applications Compiled with High Level Synthesis

Paper title

Analytical Model of Memory-Bound Applications Compiled with High Level Synthesis

Authors

Dávila-Guzmán, Maria A.; Tejero, Rubén Gran; Villarroya-Gaudó, María; Gracia, Darío Suárez

Abstract

The increasing demand for dedicated accelerators to improve energy efficiency and performance has highlighted FPGAs as a promising option to deliver both. However, programming FPGAs in hardware description languages requires considerable time and effort to achieve optimal results, which discourages many programmers from adopting this technology. High Level Synthesis (HLS) tools improve the accessibility of FPGAs, but the optimization process is still time-consuming because generating a single bitstream takes between minutes and days of compilation. Whereas placing and routing take most of this time, the RTL pipeline and memory organization are known in seconds. This early information about the organization of the upcoming bitstream is enough to provide an accurate and fast performance model. This paper presents an analytical performance model for HLS designs focused on memory-bound applications. With a careful analysis of the generated memory architecture and DRAM organization, the model predicts the execution time with a maximum error of 9.2% for a set of representative applications. Compared with previous works, our predictions reduce the estimation error by at least $2\times$ on average.
