AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation

Paper Title

AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation

Authors

Xinyu Gong, Wuyang Chen, Yifan Jiang, Ye Yuan, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang

Abstract

We present AutoPose, a novel neural architecture search (NAS) framework that is capable of automatically discovering multiple parallel branches of cross-scale connections towards accurate and high-resolution 2D human pose estimation. Recently, high-performance hand-crafted convolutional networks for pose estimation have shown growing demands for multi-scale fusion and high-resolution representations. However, current NAS works exhibit limited flexibility in scale searching: they predominantly adopt simplified search spaces of single-branch architectures. Such simplification limits the fusion of information at different scales and fails to maintain high-resolution representations. The presented AutoPose framework is able to search for multi-branch scales and network depth, in addition to the cell-level microstructure. Motivated by the search space, a novel bi-level optimization method is presented, where the network-level architecture is searched via reinforcement learning, and the cell-level search is conducted by a gradient-based method. Within 2.5 GPU days, AutoPose is able to find very competitive architectures on the MS COCO dataset that are also transferable to the MPII dataset. Our code is available at https://github.com/VITA-Group/AutoPose.
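The bi-level idea in the abstract (reinforcement learning for network-level choices, gradient descent for cell-level choices) can be illustrated with a toy sketch. This is not the AutoPose implementation: the cost tables `BRANCH_COST` and `OP_COST`, the function names, and the learning rates are all hypothetical stand-ins for a real validation loss and search space. The network level is a REINFORCE policy over the number of branches; the cell level is a softmax-relaxed weighting over candidate operations, updated by its analytic gradient.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy costs standing in for a validation loss (assumptions).
BRANCH_COST = [0.9, 0.6, 0.1, 0.7]  # cost of using 1..4 parallel branches
OP_COST = [0.8, 0.1, 0.4]           # cost of each candidate cell operation


def cell_loss(alpha):
    """Expected operation cost under softmax-relaxed cell weights."""
    probs = softmax(alpha)
    return sum(p * c for p, c in zip(probs, OP_COST))


def cell_loss_grad(alpha):
    """Analytic gradient of cell_loss: d/d alpha_j = p_j * (c_j - mean)."""
    probs = softmax(alpha)
    mean = sum(p * c for p, c in zip(probs, OP_COST))
    return [p * (c - mean) for p, c in zip(probs, OP_COST)]


def search(steps=2000, lr_alpha=0.5, lr_theta=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * len(BRANCH_COST)  # network-level policy logits
    alpha = [0.0] * len(OP_COST)      # cell-level relaxed weights
    baseline = 0.0                    # moving-average reward baseline
    for _ in range(steps):
        # Cell level: one gradient-descent step on the relaxed weights.
        grad = cell_loss_grad(alpha)
        alpha = [a - lr_alpha * g for a, g in zip(alpha, grad)]

        # Network level: sample a branch count and apply REINFORCE.
        probs = softmax(theta)
        choice = rng.choices(range(len(theta)), weights=probs)[0]
        reward = -(BRANCH_COST[choice] + cell_loss(alpha))
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # grad of log pi(choice) w.r.t. theta_i is 1[i == choice] - p_i.
        theta = [
            t + lr_theta * advantage * ((1.0 if i == choice else 0.0) - p)
            for i, (t, p) in enumerate(zip(theta, probs))
        ]
    best_branches = max(range(len(theta)), key=theta.__getitem__) + 1
    best_op = max(range(len(alpha)), key=alpha.__getitem__)
    return best_branches, best_op
```

Calling `search()` returns the preferred branch count and operation index. With these toy costs the search is expected to favor the cheapest entries in each table. In a real setting the reward and cell loss would come from training and evaluating candidate networks, which is where the GPU-day budget is spent.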
