Modeling Cross-view Interaction Consistency for Paired Egocentric Interaction Recognition

Paper Title

Modeling Cross-view Interaction Consistency for Paired Egocentric Interaction Recognition

Authors

Li, Zhongguo; Lyu, Fan; Feng, Wei; Wang, Song

Abstract

With the development of Augmented Reality (AR), egocentric action recognition (EAR) plays an important role in accurately understanding the user's demands. However, EAR is designed to recognize human-machine interaction in a single egocentric view, so it has difficulty capturing interactions between two face-to-face AR users. Paired egocentric interaction recognition (PEIR) is the task of collaboratively recognizing the interactions between two persons from the videos in their corresponding views. Unfortunately, existing PEIR methods always directly use a linear decision function to fuse the features extracted from the two corresponding egocentric videos, which ignores the consistency of the interaction in paired egocentric videos. The consistency of interactions in paired videos and the features extracted from them are correlated with each other. Building on this, we propose to build the relevance between the two views using bilinear pooling, which captures the consistency of the two views at the feature level. Specifically, each neuron in the feature maps from one view connects to the neurons from the other view, which guarantees compact consistency between the two views. All possible neuron pairs are then used for PEIR, exploiting the consistent information within them. For efficiency, we use compact bilinear pooling with Count Sketch to avoid directly computing the outer product in bilinear pooling. Experimental results on the PEV dataset show the superiority of the proposed method on the PEIR task.
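The Count Sketch step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors `x` and `y` are hypothetical stand-ins for the features of the two views, and all function names and sizes are assumptions made for illustration. The example relies on a known property of Count Sketch: the sketch of an outer product equals the circular convolution of the two individual sketches, so the outer product never has to be formed. For readability the convolution is computed directly here; practical implementations usually compute it with an FFT.

```python
import random


def make_sketch_params(n, d, rng):
    """Draw a random bucket in [0, d) and a random sign in {-1, +1} per input index."""
    h = [rng.randrange(d) for _ in range(n)]
    s = [rng.choice((-1, 1)) for _ in range(n)]
    return h, s


def count_sketch(x, h, s, d):
    """Project x into d buckets: bucket h[i] accumulates s[i] * x[i]."""
    out = [0.0] * d
    for i, v in enumerate(x):
        out[h[i]] += s[i] * v
    return out


def circular_convolve(a, b):
    """Direct circular convolution of two equal-length vectors."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out


def compact_bilinear(x, y, params_x, params_y, d):
    """Fuse two feature vectors without forming their outer product."""
    hx, sx = params_x
    hy, sy = params_y
    return circular_convolve(count_sketch(x, hx, sx, d),
                             count_sketch(y, hy, sy, d))


def sketch_of_outer_product(x, y, params_x, params_y, d):
    """Reference: sketch the explicit outer product with combined hash and sign."""
    hx, sx = params_x
    hy, sy = params_y
    out = [0.0] * d
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[(hx[i] + hy[j]) % d] += sx[i] * sy[j] * xi * yj
    return out


rng = random.Random(0)
n, d = 6, 8  # illustrative sizes only
x = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical feature from view A
y = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical feature from view B
params_x = make_sketch_params(n, d, rng)
params_y = make_sketch_params(n, d, rng)

fused = compact_bilinear(x, y, params_x, params_y, d)
reference = sketch_of_outer_product(x, y, params_x, params_y, d)
```

Here `fused` and `reference` agree up to floating-point rounding, which is what allows the pairwise interaction between the two views to be summarized in `d` dimensions rather than `n * n`.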
