Paper title
Building Application-Specific Overlays on FPGAs with High-Level Customizable IPs
Paper authors
Paper abstract
Overlays are virtual, reconfigurable architectures that sit on top of physical FPGA fabrics. An overlay that is specialized for an application, or a class of applications, offers both fast reconfiguration and a minimal performance penalty. Such an overlay is usually implemented by hardware designers in hardware "assembly" languages at the register-transfer level (RTL). This short article proposes a way for a software programmer, rather than a hardware designer, to quickly implement an application-specific overlay using high-level customizable IPs. These IPs are expressed succinctly in a specification language whose abstraction level is much higher than RTL but which can nonetheless express many performance-critical loop and data optimizations on FPGAs, and thus would offer competitive performance at a much lower maintenance cost and with much easier customization. We propose new language features for easily putting the IPs together into an overlay. A compiler automatically implements the specified optimizations to generate an efficient overlay, exposes a multi-tasking programming interface for the overlay, and inserts a runtime scheduler that schedules tasks to run on the IPs of the overlay while respecting the dependences between the tasks. While an application written in any language can take advantage of the overlay through the programming interface, we show a particular usage scenario in which the application itself is also succinctly specified in the same language. We describe the new language features for expressing overlays, and illustrate them with an LU decomposer and a convolutional neural network. A system that implements the language features and workloads is under construction.
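The abstract does not give the scheduler's design, so the following is only a minimal sketch of what "scheduling tasks to run on the IPs while respecting dependences" could look like in software. It assumes unit-time tasks, one task per IP per time step, and a plain topological ordering (Kahn's algorithm). The `Task` class, the `schedule` function, and the IP and task names (`lu`, `trsm0`, `trsm1`, `gemm`) are hypothetical and do not come from the paper.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    ip: str  # hypothetical name of the overlay IP this task runs on
    deps: list = field(default_factory=list)  # tasks that must finish first


def schedule(tasks):
    """Return a list of time steps; each step maps an IP to the task it runs.

    Assumptions (not from the paper): every task takes one time step and
    each IP runs at most one task per step.
    """
    by_name = {t.name: t for t in tasks}
    remaining = {t.name: len(t.deps) for t in tasks}  # unmet dependences
    dependents = {t.name: [] for t in tasks}
    for t in tasks:
        for d in t.deps:
            dependents[d].append(t.name)

    ready = deque(t.name for t in tasks if not t.deps)
    steps, done = [], 0
    while ready:
        busy = {}            # IP -> task chosen for this step
        deferred = deque()   # ready tasks whose IP is already taken
        while ready:
            name = ready.popleft()
            ip = by_name[name].ip
            if ip in busy:
                deferred.append(name)
            else:
                busy[ip] = name
        steps.append(busy)
        done += len(busy)
        # Tasks that ran in this step release their dependents.
        for name in busy.values():
            for nxt in dependents[name]:
                remaining[nxt] -= 1
                if remaining[nxt] == 0:
                    deferred.append(nxt)
        ready = deferred
    if done != len(tasks):
        raise ValueError("cyclic dependences between tasks")
    return steps


# Hypothetical task graph for one step of a blocked LU decomposition.
demo_tasks = [
    Task("lu00", "lu"),
    Task("trsm01", "trsm0", ["lu00"]),
    Task("trsm10", "trsm1", ["lu00"]),
    Task("gemm11", "gemm", ["trsm01", "trsm10"]),
    Task("lu11", "lu", ["gemm11"]),
]

if __name__ == "__main__":
    for i, step in enumerate(schedule(demo_tasks)):
        print(i, step)
```

In this toy graph the two triangular-solve tasks run in the same step because they target different IPs, while `gemm11` waits until both have finished. A real runtime scheduler on an FPGA overlay would presumably deal with variable task latencies and data movement, which this sketch ignores.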