Paper title
GrateTile: Efficient Sparse Tensor Tiling for CNN Processing
Authors
Abstract
We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized subtensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress subtensors on the fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average DRAM bandwidth reduction of 55% while using only 0.6% of the feature map size for indexing storage.
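The idea of storing uneven-sized subtensors in a compressed yet randomly accessible form can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tile-size pattern `(2, 6)`, the zero-bitmask compression, and the names `split_points`, `compress`, and `fetch_tile` are all hypothetical. It shows only the general principle that a small per-tile offset table lets a single tile be fetched and decompressed without touching the others.

```python
from itertools import cycle


def split_points(length, pattern=(2, 6)):
    """Return tile boundaries that cycle through `pattern` (assumed sizes)."""
    bounds = [0]
    for size in cycle(pattern):
        if bounds[-1] >= length:
            break
        bounds.append(min(bounds[-1] + size, length))
    return bounds


def compress(row, pattern=(2, 6)):
    """Pack a sparse row tile by tile: one bitmask and one offset per tile."""
    bounds = split_points(len(row), pattern)
    masks, offsets, values = [], [], []
    for start, end in zip(bounds, bounds[1:]):
        tile = row[start:end]
        offsets.append(len(values))            # index entry: where this tile starts
        masks.append([v != 0 for v in tile])   # which positions hold nonzeros
        values.extend(v for v in tile if v != 0)
    return {"bounds": bounds, "masks": masks, "offsets": offsets, "values": values}


def fetch_tile(packed, i):
    """Decompress tile `i` alone, using only its mask and its offset."""
    mask = packed["masks"][i]
    start = packed["offsets"][i]
    nonzeros = iter(packed["values"][start:start + sum(mask)])
    return [next(nonzeros) if keep else 0 for keep in mask]


row = [0, 3, 0, 0, 0, 5, 0, 0, 7, 0]
packed = compress(row)
middle_tile = fetch_tile(packed, 1)
```

In this sketch the offset table plays the role of the indexing storage: it grows with the number of tiles rather than with the number of elements, which is why it can stay small relative to the data it indexes.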