Paper Title


BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing

Paper Authors

Yoshitomo Matsubara, Davide Callegaro, Sameer Singh, Marco Levorato, Francesco Restuccia

Paper Abstract


Although mission-critical applications require the use of deep neural networks (DNNs), their continuous execution at mobile devices results in a significant increase in energy consumption. While edge offloading can decrease energy consumption, erratic patterns in channel quality, network load, and edge server load can lead to severe disruption of the system's key operations. An alternative approach, called split computing, generates compressed representations within the model (called "bottlenecks") to reduce bandwidth usage and energy consumption. Prior work has proposed approaches that introduce additional layers, to the detriment of energy consumption and latency. For this reason, we propose a new framework called BottleFit, which, in addition to targeted DNN architecture modifications, includes a novel training strategy to achieve high accuracy even with strong compression rates. We apply BottleFit to cutting-edge DNN models in image classification, and show that BottleFit achieves 77.1% data compression with up to 0.6% accuracy loss on the ImageNet dataset, while state-of-the-art approaches such as SPINN lose up to 6% in accuracy. We experimentally measure the power consumption and latency of an image classification application running on an NVIDIA Jetson Nano board (GPU-based) and a Raspberry Pi board (GPU-less). We show that BottleFit decreases power consumption and latency respectively by up to 49% and 89% with respect to (w.r.t.) local computing, and by 37% and 55% w.r.t. edge offloading. We also compare BottleFit with state-of-the-art autoencoder-based approaches, and show that (i) BottleFit reduces power consumption and execution time respectively by up to 54% and 44% on the Jetson and 40% and 62% on the Raspberry Pi; (ii) the size of the head model executed on the mobile device is 83 times smaller. We publish the code repository for reproducibility of the results in this study.
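The split-computing idea described in the abstract can be illustrated with a minimal sketch: a lightweight "head" runs on the mobile device and emits a low-dimensional bottleneck tensor, which is transmitted in place of the raw input; a "tail" on the edge server finishes the forward pass. This is only an illustration of the general technique, not BottleFit's actual architecture or training strategy; all layer shapes, weights, and function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def head(x, w_head):
    # Runs on the mobile device: maps the input to a low-dimensional
    # "bottleneck" representation that is cheap to transmit.
    return np.maximum(x @ w_head, 0.0)  # linear layer + ReLU (illustrative)

def tail(z, w_tail):
    # Runs on the edge server: completes the forward pass from the
    # compressed representation and outputs class scores.
    return z @ w_tail

# Hypothetical dimensions (not taken from the paper).
input_dim, bottleneck_dim, num_classes = 3072, 64, 10
w_head = rng.normal(size=(input_dim, bottleneck_dim)) * 0.01
w_tail = rng.normal(size=(bottleneck_dim, num_classes)) * 0.01

x = rng.normal(size=(1, input_dim))   # one input sample on the device
z = head(x, w_head)                   # computed locally; z is what crosses
                                      # the wireless channel instead of x
scores = tail(z, w_tail)              # computed at the edge server

# Fraction of values saved by sending the bottleneck instead of the input.
compression = 1.0 - z.size / x.size
```

In this toy setup the device transmits 64 values instead of 3072, a ~98% reduction in transmitted values; BottleFit's contribution is a training strategy that keeps accuracy high under such aggressive in-model compression.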
