Paper Title
Searching for Efficient Neural Architectures for On-Device ML on Edge TPUs
Paper Authors
Paper Abstract
On-device ML accelerators are becoming a standard in modern mobile systems-on-chip (SoCs). Neural architecture search (NAS) comes to the rescue for efficiently utilizing the high compute throughput offered by these accelerators. However, existing NAS frameworks have several practical limitations in scaling to multiple tasks and different target platforms. In this work, we provide a two-pronged approach to this challenge: (i) a NAS-enabling infrastructure that decouples model cost evaluation, search space design, and the NAS algorithm to rapidly target various on-device ML tasks, and (ii) search spaces crafted from group-convolution-based inverted bottleneck (IBN) variants that provide flexible quality/performance trade-offs on ML accelerators, complementing the existing full- and depthwise-convolution-based IBNs. Using this approach, we target a state-of-the-art mobile platform, the Google Tensor SoC, and demonstrate neural architectures that improve the quality-performance Pareto frontier for various computer vision (classification, detection, segmentation) as well as natural language processing tasks.
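To make the group-convolution-based IBN variant concrete, below is a minimal sketch of an inverted bottleneck block whose spatial convolution uses a configurable number of groups. This is an illustrative assumption, not the paper's exact block design: the function name, hyperparameter defaults, and the choice of which convolution is grouped are hypothetical. The useful intuition is that `groups=1` recovers a full convolution and `groups=mid_ch` recovers a depthwise convolution, so the group count is a single knob spanning the quality/performance trade-off the abstract describes.

```python
import tensorflow as tf

def group_conv_ibn(x, expansion=4, groups=4, kernel_size=3, stride=1):
    """Sketch of an inverted bottleneck (IBN) where the spatial
    convolution is a grouped convolution rather than a full or
    depthwise one. Names and defaults are illustrative assumptions."""
    in_ch = x.shape[-1]
    mid_ch = in_ch * expansion  # groups must divide mid_ch

    # 1x1 pointwise expansion.
    h = tf.keras.layers.Conv2D(mid_ch, 1, use_bias=False)(x)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU()(h)

    # Grouped KxK spatial convolution: groups=1 -> full conv,
    # groups=mid_ch -> depthwise conv; intermediate values interpolate
    # between the two in compute cost and representational capacity.
    h = tf.keras.layers.Conv2D(mid_ch, kernel_size, strides=stride,
                               padding="same", groups=groups,
                               use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)
    h = tf.keras.layers.ReLU()(h)

    # 1x1 pointwise projection back to the input width.
    h = tf.keras.layers.Conv2D(in_ch, 1, use_bias=False)(h)
    h = tf.keras.layers.BatchNormalization()(h)

    # Residual connection when the shapes match.
    if stride == 1:
        h = tf.keras.layers.Add()([x, h])
    return h
```

In a NAS search space built from such blocks, `expansion`, `groups`, and `kernel_size` would be searchable choices per layer, with a platform-specific cost model (e.g., measured latency on the target accelerator) guiding the search toward the Pareto frontier.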