Paper Title

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Authors

Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

Abstract

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference executions. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN executions on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring and super resolution.
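
The abstract names hardware-friendly structured model pruning as one of the two key techniques. As a rough illustration of the general idea only (not the authors' specific pruning scheme, which is described in the paper), the sketch below zeroes out whole convolution filters ranked by L2 norm, so that entire output channels can later be skipped; the function name prune_conv_filters and the keep_ratio parameter are hypothetical, and PyTorch is assumed.

```python
# Generic sketch of structured (filter/channel) pruning -- illustration only,
# not the pruning scheme proposed in the paper. Filters with the smallest
# L2 norms are zeroed so that entire output channels become removable.
import torch
import torch.nn as nn


def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Zero out whole output filters of a Conv2d layer by L2-norm ranking."""
    with torch.no_grad():
        weight = conv.weight.data                   # (out_ch, in_ch, kH, kW)
        norms = weight.flatten(1).norm(p=2, dim=1)  # one L2 norm per filter
        n_keep = max(1, int(keep_ratio * weight.size(0)))
        keep_idx = norms.topk(n_keep).indices       # filters to retain
        mask = torch.zeros_like(norms)
        mask[keep_idx] = 1.0
        conv.weight.data *= mask.view(-1, 1, 1, 1)  # zero the pruned filters
        if conv.bias is not None:
            conv.bias.data *= mask                  # zero their biases as well
    return conv


# Usage: prune half the filters of a toy convolution layer.
layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
prune_conv_filters(layer, keep_ratio=0.5)
x = torch.randn(1, 16, 56, 56)
print(layer(x).shape)  # torch.Size([1, 32, 56, 56]); pruned channels are all-zero
```

Because this style of pruning removes filters at channel granularity rather than individual weights, the remaining computation stays regular, which is the kind of structure a mobile compiler can exploit when generating optimized inference code.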
