Paper Title
Estimation of Reliable Proposal Quality for Temporal Action Detection
Authors
Abstract
Temporal action detection (TAD) aims to locate and recognize actions in untrimmed videos. Anchor-free methods have made remarkable progress; they mainly formulate TAD as two tasks, classification and localization, handled by two separate branches. This paper reveals a temporal misalignment between the two tasks that hinders further progress. To address this, we propose a new method that considers the moment and region perspectives simultaneously and aligns the two tasks by estimating reliable proposal quality. For the moment perspective, we design the Boundary Evaluate Module (BEM), which focuses on local appearance and motion evolution to estimate boundary quality and works at multiple scales to handle varied action durations. For the region perspective, we introduce the Region Evaluate Module (REM), which uses a new and efficient sampling method to build a proposal feature representation that contains more contextual information than a point feature, and uses it to refine the category score and proposal boundary. The proposed Boundary Evaluate Module and Region Evaluate Module (BREM) are generic and can be easily integrated with other anchor-free TAD methods to achieve superior performance. In our experiments, BREM is combined with two different frameworks and improves performance on THUMOS14 by 3.6% and 1.0% respectively, reaching a new state of the art (63.6% average mAP). Meanwhile, a competitive result of 36.2% average mAP is achieved on ActivityNet-1.3, where BREM also brings a consistent improvement. The code is released at https://github.com/Junshan233/BREM.
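The misalignment between classification and localization described in the abstract can be illustrated with a small sketch. This is not the BREM implementation: the helper names `temporal_iou` and `rerank`, the multiplicative score fusion, and the use of ground-truth temporal IoU as a stand-in for a predicted proposal quality are all assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal intersection over union of two segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def rerank(
    proposals: List[Segment],
    cls_scores: List[float],
    quality: List[float],
) -> List[Tuple[Segment, float]]:
    """Rank proposals by classification score times quality score.

    Hypothetical fusion rule: a real detector would predict the
    quality value; here the caller supplies it.
    """
    fused = [(p, c * q) for p, c, q in zip(proposals, cls_scores, quality)]
    return sorted(fused, key=lambda item: item[1], reverse=True)


# Toy example (made-up numbers): one ground-truth action and two proposals.
ground_truth: Segment = (10.0, 20.0)
proposals: List[Segment] = [(4.0, 14.0), (10.5, 19.5)]
cls_scores = [0.9, 0.6]  # the poorly localized proposal scores higher

# Oracle quality for illustration only; at inference it must be estimated.
quality = [temporal_iou(p, ground_truth) for p in proposals]
ranked = rerank(proposals, cls_scores, quality)
print(ranked[0][0])  # the well-localized proposal (10.5, 19.5) ranks first
```

Ranking by classification score alone would put the proposal `(4.0, 14.0)` first even though it overlaps the action poorly; weighting by a reliable quality estimate reverses the order, which is the kind of alignment the abstract argues for.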