Paper Title

Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking

Paper Authors

Zhangyong Tang, Tianyang Xu, Hui Li, Xiao-Jun Wu, Xuefeng Zhu, Josef Kittler

Abstract


We address the problem of multi-modal object tracking in video and explore various options for fusing the complementary information conveyed by the visible (RGB) and thermal infrared (TIR) modalities, including pixel-level, feature-level, and decision-level fusion. Specifically, different from existing methods, the paradigm of the image fusion task is followed for fusion at the pixel level. Feature-level fusion is accomplished by an attention mechanism in which channels are excited optionally. Besides, at the decision level, a novel fusion strategy is put forward, since even an effortless averaging configuration has shown superiority. The effectiveness of the proposed decision-level fusion strategy is attributable to a number of innovative contributions, including a dynamic weighting of the RGB and TIR contributions and a linear template update operation. A variant of this strategy produced the winning tracker at the Visual Object Tracking Challenge 2020 (VOT-RGBT2020). The concurrent exploration of innovative pixel- and feature-level fusion strategies highlights the advantages of the proposed decision-level fusion method. Extensive experimental results on three challenging datasets, \textit{i.e.}, GTOT, VOT-RGBT2019, and VOT-RGBT2020, demonstrate the effectiveness and robustness of the proposed method, compared to the state-of-the-art approaches. Code will be shared at \textcolor{blue}{\emph{https://github.com/Zhangyong-Tang/DFAT}}.
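The two decision-level ingredients named in the abstract, a dynamic weighting of the RGB and TIR contributions and a linear template update, can be illustrated with a minimal sketch. This is not the authors' actual DFAT implementation: the choice of peak response strength as the reliability cue, the function names, and the learning rate are assumptions for illustration only.

```python
import numpy as np

def dynamic_fusion(resp_rgb: np.ndarray, resp_tir: np.ndarray) -> np.ndarray:
    """Fuse two tracker response maps at the decision level.

    The per-frame weight of each modality is derived from its peak
    response strength (a stand-in reliability cue; the paper's actual
    weighting scheme may differ).
    """
    q_rgb = float(resp_rgb.max())
    q_tir = float(resp_tir.max())
    w_rgb = q_rgb / (q_rgb + q_tir + 1e-12)  # normalized dynamic weight
    return w_rgb * resp_rgb + (1.0 - w_rgb) * resp_tir

def linear_template_update(template: np.ndarray,
                           new_template: np.ndarray,
                           lr: float = 0.01) -> np.ndarray:
    """Linear (exponential moving average) template update:
    z <- (1 - lr) * z + lr * z_new, with an illustrative learning rate."""
    return (1.0 - lr) * template + lr * new_template
```

Under this sketch, a modality whose response map peaks more sharply contributes proportionally more to the fused decision in that frame, while the template drifts slowly toward recent appearance.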
