Paper Title


Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration

Authors

Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo

Abstract


Research interest in specialized hardware accelerators for deep neural networks (DNN) has spiked recently owing to their superior performance and efficiency. However, today's DNN accelerators primarily focus on accelerating specific "kernels" such as convolution and matrix multiplication, which are vital but only part of an end-to-end DNN-enabled application. Meaningful speedups over the entire application often require supporting computations that are, while massively parallel, ill-suited to DNN accelerators. Integrating a general-purpose processor such as a CPU or a GPU incurs significant data movement overhead and leads to resource under-utilization on the DNN accelerators. We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications. The key to SMA is the temporal integration of the systolic execution model with the GPU-like SIMD execution model. The SMA exploits the common components shared between the systolic-array accelerator and the GPU, and provides lightweight reconfiguration capability to switch between the two modes in-situ. The SMA achieves up to 63% performance improvement while consuming 23% less energy than the baseline Volta architecture with TensorCore.
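The abstract's core idea is temporal multi-mode execution: one set of hardware resources is time-shared between a systolic (matrix-multiply) mode and a GPU-like SIMD mode, so intermediate data never crosses a host-accelerator boundary. The minimal sketch below models that idea in software; the class name, mode flag, and methods are illustrative assumptions for exposition, not the paper's actual design or API.

```python
import numpy as np

class SMASketch:
    """Illustrative model (not the paper's implementation) of temporal
    multi-mode execution: the same 'array' alternates between a systolic
    matrix-multiply mode and a GPU-like SIMD mode via a lightweight
    reconfiguration step, so data stays in place between kernels."""

    def __init__(self):
        self.mode = "simd"  # reconfiguration is modeled as a mode flag

    def reconfigure(self, mode):
        assert mode in ("systolic", "simd")
        self.mode = mode

    def matmul(self, a, b):
        # DNN kernel: runs in systolic mode.
        self.reconfigure("systolic")
        return a @ b  # stands in for the systolic-array dataflow

    def elementwise(self, fn, x):
        # General-purpose parallel work: runs in SIMD mode.
        self.reconfigure("simd")
        return fn(x)  # stands in for GPU-like SIMD lanes

# An end-to-end pipeline alternating modes on the same device:
# SIMD preprocessing -> systolic matmul -> SIMD activation.
sma = SMASketch()
x = np.ones((4, 8))
w = np.ones((8, 2))
x = sma.elementwise(lambda t: t - t.mean(), x)      # normalize (SIMD)
y = sma.matmul(x, w)                                # DNN kernel (systolic)
y = sma.elementwise(lambda t: np.maximum(t, 0), y)  # ReLU (SIMD)
```

The point of the sketch is the absence of any copy between stages: because both modes share the same state, switching costs only a reconfiguration, which is what lets SMA avoid the data-movement overhead of a discrete CPU/GPU plus accelerator pairing.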
