Paper Title

Label-Efficient Online Continual Object Detection in Streaming Video

Paper Authors

Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou

Paper Abstract


Humans can watch a continuous video stream and effortlessly perform continual acquisition and transfer of new knowledge with minimal supervision while retaining previously learnt experiences. In contrast, existing continual learning (CL) methods require fully annotated labels to effectively learn from individual frames in a video stream. Here, we examine a more realistic and challenging problem: Label-Efficient Online Continual Object Detection (LEOCOD) in streaming video. We propose a plug-and-play module, Efficient-CLS, that can be easily inserted into and improve existing continual learners for object detection in video streams with reduced data annotation costs and model retraining time. We show that our method achieves significant improvements with minimal forgetting across all supervision levels on two challenging CL benchmarks for streaming real-world videos. Remarkably, with only 25% annotated video frames, our method still outperforms the base CL learners, which are trained with 100% annotations on all video frames. The data and source code will be publicly available at https://github.com/showlab/Efficient-CLS.
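The LEOCOD setting described in the abstract can be pictured as an online loop over a frame stream in which only a fraction of frames arrive with ground-truth boxes. The sketch below is a minimal, hypothetical illustration of that setting only; the names (`ContinualDetector`, `stream`) and the fixed every-4th-frame annotation schedule are assumptions for illustration, not the paper's actual Efficient-CLS implementation.

```python
# Hypothetical sketch of the LEOCOD setting: an online learner receives
# a video stream frame by frame, but only ~25% of frames are annotated.
# All names here are illustrative, not from the paper's codebase.

ANNOTATION_RATE = 0.25  # e.g. only 25% of frames carry labels

class ContinualDetector:
    """Stand-in for any base continual-learning object detector."""
    def __init__(self):
        self.updates = 0

    def update(self, frame, boxes):
        # Supervised online update on a labelled frame.
        self.updates += 1

    def predict(self, frame):
        # Inference only on an unlabelled frame (no parameter update).
        return []

def stream(num_frames=100):
    # Deterministic schedule for the sketch: every 4th frame is labelled,
    # matching the 25% annotation rate discussed in the abstract.
    for t in range(num_frames):
        boxes = "ground-truth boxes" if t % 4 == 0 else None
        yield t, boxes

detector = ContinualDetector()
for frame, boxes in stream():
    if boxes is not None:
        detector.update(frame, boxes)  # learn from sparse annotations
    else:
        detector.predict(frame)        # unlabelled frames: predict only

print(detector.updates)  # 25 of 100 frames trigger a learning step
```

The point of the sketch is the asymmetry in the loop: learning happens only on the sparse labelled frames, yet the detector must still serve predictions on every frame of the stream.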
