H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness

Paper Title

H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness

Authors

Zhang, Xinyi; Hao, Cong; Zhou, Peipei; Jones, Alex; Hu, Jingtong

Abstract

The complex nature of real-world problems calls for heterogeneity in both machine learning (ML) models and hardware systems. The heterogeneity in ML models comes from multi-sensor perceiving and multi-task learning, i.e., multi-modality multi-task (MMMT), resulting in diverse deep neural network (DNN) layers and computation patterns. The heterogeneity in systems comes from diverse processing components, as it becomes the prevailing method to integrate multiple dedicated accelerators into one system. Therefore, a new problem emerges: heterogeneous model to heterogeneous system mapping (H2H). While previous mapping algorithms mostly focus on efficient computations, in this work, we argue that it is indispensable to consider computation and communication simultaneously for better system efficiency. We propose a novel H2H mapping algorithm with both computation and communication awareness; by slightly trading computation for communication, the system overall latency and energy consumption can be largely reduced. The superior performance of our work is evaluated based on MAESTRO modeling, demonstrating 15%-74% latency reduction and 23%-64% energy reduction compared with existing computation-prioritized mapping algorithms.
