Paper Title
Towards Tailored Models on Private AIoT Devices: Federated Direct Neural Architecture Search
Authors
Abstract
Neural networks often face stringent resource constraints when deployed on edge devices. To tackle these constraints with less human effort, automated machine learning has become popular for finding neural architectures that fit diverse Artificial Intelligence of Things (AIoT) scenarios. Recently, to prevent the leakage of private information while still enabling automated machine intelligence, there has been an emerging trend to integrate federated learning and neural architecture search (NAS). Promising as this may seem, the combined difficulties of the two fields make algorithm development quite challenging. In particular, how to efficiently search for the optimal neural architecture directly from massive non-independent and identically distributed (non-IID) data among AIoT devices in a federated manner is a hard nut to crack. In this paper, to tackle this challenge, we leverage the advances in ProxylessNAS and propose a Federated Direct Neural Architecture Search (FDNAS) framework that allows for hardware-friendly NAS from non-IID data across devices. To further adapt to both various data distributions and different types of devices with heterogeneous embedded hardware platforms, we propose, inspired by meta-learning, a Cluster Federated Direct Neural Architecture Search (CFDNAS) framework that achieves device-aware NAS, in the sense that each device can learn a tailored deep learning model for its particular data distribution and hardware constraint. Extensive experiments on non-IID datasets show that the proposed solution achieves state-of-the-art accuracy-efficiency trade-offs in the presence of both data and device heterogeneity.
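To make the idea of searching an architecture "in a federated manner" more concrete, here is a minimal sketch of one server-side aggregation step. It is not the paper's algorithm: the abstract does not state how FDNAS aggregates client updates, so the sketch assumes a FedAvg-style average weighted by each client's sample count, applied to per-layer architecture parameters. The names `federated_average` and `pick_operations`, the layer key `"layer1"`, and the example numbers are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each layer name maps to a vector of
# architecture parameters, one score per candidate operation.
Params = Dict[str, List[float]]


def federated_average(client_updates: List[Tuple[Params, int]]) -> Params:
    """Average parameter vectors across clients, weighted by sample count.

    Assumption: FedAvg-style aggregation. The abstract does not specify
    the aggregation rule actually used by FDNAS.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    reference = client_updates[0][0]
    averaged: Params = {}
    for key, vector in reference.items():
        averaged[key] = [
            sum(params[key][i] * n for params, n in client_updates) / total
            for i in range(len(vector))
        ]
    return averaged


def pick_operations(arch_params: Params) -> Dict[str, int]:
    """For each layer, pick the index of the highest-scoring operation."""
    return {
        layer: max(range(len(scores)), key=scores.__getitem__)
        for layer, scores in arch_params.items()
    }


# Two clients with different (non-IID) data sizes and different preferences.
client_a = ({"layer1": [1.0, 0.0, 0.0]}, 100)
client_b = ({"layer1": [0.0, 2.0, 0.0]}, 300)

global_arch = federated_average([client_a, client_b])
chosen = pick_operations(global_arch)
```

Only parameter vectors and sample counts reach the server in this sketch, never raw data, which is the privacy motivation the abstract describes. A cluster-based variant in the spirit of CFDNAS could run the same aggregation separately within each group of similar devices, though how the paper forms and uses clusters is not given in the abstract.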