Paper title
Efficient Table-based Function Approximation on FPGAs using Interval Splitting and BRAM Instantiation
Authors
Abstract
This paper proposes a novel approach for generating memory-efficient table-based function approximation circuits for FPGAs. Given a function f(x) to be approximated on an interval [x0, x0+a] with a maximum approximation error Ea, the goal is to determine a function table implementation with a minimized memory footprint, i.e., a minimal number of entries to be stored. State-of-the-art approaches sample the given interval evenly at so-called breakpoints and use linear interpolation between two adjacent breakpoints to evaluate f(x) within the maximum error bound. In contrast, we first propose three interval-splitting algorithms that drastically reduce the required memory footprint, based on the observation that in sub-intervals of low gradient a coarser sampling grid suffices to satisfy the maximum interpolation error bound. Experiments on elementary mathematical functions show that a large fraction of the memory footprint can be saved. Second, we introduce a hardware architecture that implements sub-interval selection, breakpoint lookup, and interpolation at a latency of just 9 clock cycles. Third, within each generated circuit design, BRAMs are automatically instantiated rather than synthesizing the reduced-footprint function table from LUT primitives, which provides an additional degree of resource efficiency.
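The memory saving claimed in the abstract can be illustrated with a small Python sketch. It is not one of the paper's three interval-splitting algorithms, which the abstract does not describe. It assumes a naive split into equal-width sub-intervals, restricts segment counts to powers of two, and estimates the interpolation error by sampling rather than by an analytic bound. The function names `max_interp_error`, `min_segments`, and `table_entries` are hypothetical.

```python
import math


def max_interp_error(f, lo, hi, n_segments, probes=16):
    """Largest sampled |f(x) - linear interpolant| over n_segments equal
    segments of [lo, hi]. This is a sampled estimate, not a proven bound."""
    h = (hi - lo) / n_segments
    worst = 0.0
    for i in range(n_segments):
        x0 = lo + i * h
        y0, y1 = f(x0), f(x0 + h)
        for k in range(1, probes):
            t = k / probes
            approx = y0 + t * (y1 - y0)  # linear interpolation between breakpoints
            worst = max(worst, abs(f(x0 + t * h) - approx))
    return worst


def min_segments(f, lo, hi, max_err):
    """Smallest power-of-two segment count whose sampled error meets max_err
    (power-of-two counts are an assumption made for simplicity)."""
    n = 1
    while max_interp_error(f, lo, hi, n) > max_err:
        n *= 2
    return n


def table_entries(f, lo, hi, max_err, n_sub):
    """Total stored breakpoints when [lo, hi] is cut into n_sub equal
    sub-intervals, each sampled evenly with its own spacing.
    n_sub == 1 corresponds to the even sampling of the whole interval."""
    width = (hi - lo) / n_sub
    total = 0
    for j in range(n_sub):
        a = lo + j * width
        total += min_segments(f, a, a + width, max_err) + 1
    return total


if __name__ == "__main__":
    # log(x) is steep near x = 1 and flat near x = 17, so a per-sub-interval
    # grid should need fewer entries than one even grid.
    uniform = table_entries(math.log, 1.0, 17.0, 1e-3, n_sub=1)
    split = table_entries(math.log, 1.0, 17.0, 1e-3, n_sub=4)
    print("even sampling:", uniform, "entries; 4 sub-intervals:", split, "entries")
```

In this sketch, breakpoints shared by adjacent sub-intervals are counted twice, so the entry count for the split case is slightly pessimistic. A hardware implementation would additionally need the sub-interval selection logic mentioned in the abstract, which is not modeled here.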