Paper Title

ReMOTS: Self-Supervised Refining Multi-Object Tracking and Segmentation

Authors

Fan Yang, Xin Chang, Chenyu Dang, Ziqiang Zheng, Sakriani Sakti, Satoshi Nakamura, Yang Wu

Abstract

We aim to improve the performance of Multiple Object Tracking and Segmentation (MOTS) through refinement. However, refining MOTS results remains challenging, which can be attributed to the fact that appearance features are not adapted to the target videos and that proper thresholds to discriminate identities are hard to find. To tackle this issue, we propose a self-supervised refining MOTS (i.e., ReMOTS) framework. ReMOTS takes four main steps to refine MOTS results from the data-association perspective: (1) training the appearance encoder using predicted masks; (2) associating observations across adjacent frames to form short-term tracklets; (3) retraining the appearance encoder using short-term tracklets as reliable pseudo labels; (4) merging short-term tracklets into long-term tracks using the adapted appearance features, with thresholds obtained automatically from statistical information. Using ReMOTS, we reached $1^{st}$ place on CVPR 2020 MOTS Challenge 1, with an sMOTSA score of $69.9$.
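The merging step (4) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the cosine distance, the "mean + 2·std of intra-tracklet distances" threshold rule, and the greedy pairwise merging are all assumptions standing in for the statistically derived threshold and association scheme described in the abstract.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity between two appearance embeddings
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def automatic_threshold(tracklets):
    # Derive the merging threshold from intra-tracklet statistics:
    # embeddings within one short-term tracklet belong to the same identity,
    # so their pairwise distances tell us what "same identity" looks like.
    # (mean + 2*std is an assumed rule, not the paper's exact formula.)
    dists = [cosine_distance(e1, e2)
             for emb in tracklets
             for i, e1 in enumerate(emb)
             for e2 in emb[i + 1:]]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return mean + 2 * std

def merge_tracklets(tracklets, thresh):
    # Greedily merge short-term tracklets into long-term tracks: two
    # tracklets share an identity when their mean embeddings are closer
    # than the statistically derived threshold.
    means = [[sum(col) / len(emb) for col in zip(*emb)] for emb in tracklets]
    labels = list(range(len(tracklets)))
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            if cosine_distance(means[i], means[j]) < thresh:
                old, new = labels[j], labels[i]
                labels = [new if l == old else l for l in labels]
    return labels

# Toy example: four 2-frame tracklets with 2-D embeddings, two true identities.
tracklets = [
    [[1.0, 0.0], [0.98, 0.2]],   # identity A
    [[0.99, 0.1], [1.0, 0.05]],  # identity A
    [[0.0, 1.0], [0.1, 0.99]],   # identity B
    [[0.05, 1.0], [0.0, 0.98]],  # identity B
]
thresh = automatic_threshold(tracklets)
labels = merge_tracklets(tracklets, thresh)
```

Deriving the threshold from intra-tracklet distances is the key idea: it adapts to the target video's own appearance statistics instead of relying on a hand-tuned constant.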
