Paper Title

Convex Combination Consistency between Neighbors for Weakly-supervised Action Localization

Authors

Qinying Liu, Zilei Wang, Ruoxi Chen, Zhilin Li

Abstract

Weakly-supervised temporal action localization (WTAL) aims to detect action instances with only weak supervision, e.g., video-level labels. The current de facto pipeline locates action instances by thresholding temporal class activation sequences and grouping contiguous high-score regions. In this pipeline, the model's capacity to recognize the relationships between adjacent snippets is vital, because it determines the quality of the action boundaries. However, this step is error-prone, since the variations between adjacent snippets are typically subtle, and unfortunately this issue has been overlooked in the literature. To tackle it, we propose a novel WTAL approach named Convex Combination Consistency between Neighbors (C$^3$BN). C$^3$BN consists of two key ingredients: a micro data augmentation strategy that increases the diversity between adjacent snippets by taking their convex combinations, and a macro-micro consistency regularization that encourages the model to be invariant to the transformations w.r.t. video semantics, snippet predictions, and snippet representations. Consequently, the model is pushed to explore fine-grained patterns between adjacent snippets, resulting in more robust action boundary localization. Experimental results demonstrate the effectiveness of C$^3$BN on top of various baselines for WTAL with video-level and point-level supervision. Code is at https://github.com/Qinying-Liu/C3BN.
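
The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `mix_adjacent` and `prediction_consistency`, the uniform sampling of the mixing weights, the mean-squared-error form of the consistency term, and the toy linear and tanh models are all assumptions made for illustration, and only the snippet-prediction consistency is shown.

```python
import numpy as np

def mix_adjacent(x, lam):
    """Convex combination of each snippet with its right neighbor.

    x:   array of shape (T, D), one row per snippet.
    lam: array of shape (T-1,), mixing weights in [0, 1].
    Returns an array of shape (T-1, D).
    """
    lam = lam[:, None]  # broadcast the weights over the feature dimension
    return lam * x[:-1] + (1.0 - lam) * x[1:]

def prediction_consistency(model, features, lam):
    """Hypothetical consistency term (MSE is an assumption, not from the paper).

    Compares the prediction on mixed features with the same convex
    combination applied to the predictions on the original snippets.
    """
    preds = model(features)                    # (T, C) snippet predictions
    mixed_feats = mix_adjacent(features, lam)  # (T-1, D) augmented snippets
    mixed_preds = mix_adjacent(preds, lam)     # (T-1, C) target predictions
    return float(np.mean((model(mixed_feats) - mixed_preds) ** 2))

# Toy setup: 6 snippets, 4-dim features, 3 classes (all values are made up).
rng = np.random.default_rng(0)
T, D, C = 6, 4, 3
features = rng.normal(size=(T, D))
lam = rng.uniform(size=T - 1)  # assumption: uniform mixing weights
W = rng.normal(size=(D, C))

def linear_model(x):
    return x @ W

def nonlinear_model(x):
    return np.tanh(x @ W)

# A linear model commutes with convex combination, so its loss is ~0;
# a nonlinear model generally does not, so its loss is non-negative.
loss_linear = prediction_consistency(linear_model, features, lam)
loss_nonlinear = prediction_consistency(nonlinear_model, features, lam)
```

In a training loop, a term of this kind would be added to the base WTAL loss so that the model's outputs vary smoothly between neighboring snippets; how the weighting and the other two consistency levels (video semantics and snippet representations) are handled is left to the original code.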
