Paper Title

Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline

Paper Authors

Kailai Zhou, Yibo Wang, Tao Lv, Yunqian Li, Linsen Chen, Qiu Shen, Xun Cao

Paper Abstract

We tackle a rarely explored task named Insubstantial Object Detection (IOD), which aims to localize objects with the following characteristics: (1) amorphous shape with an indistinct boundary; (2) similarity to surroundings; (3) absence in color. Accordingly, insubstantial objects are far more challenging to distinguish in a single static frame, and the collaborative representation of spatial and temporal information is crucial. We therefore construct an IOD-Video dataset comprising 600 videos (141,017 frames) covering various distances, sizes, visibility levels, and scenes captured in different spectral ranges. In addition, we develop a spatio-temporal aggregation framework for IOD, in which different backbones are deployed and a spatio-temporal aggregation loss (STAloss) is carefully designed to leverage consistency along the time axis. Experiments conducted on the IOD-Video dataset demonstrate that spatio-temporal aggregation can significantly improve the performance of IOD. We hope our work will attract further research into this valuable yet challenging task. The code will be available at: \url{https://github.com/CalayZhou/IOD-Video}.
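
To make the idea of leveraging consistency along the time axis concrete, the sketch below shows a generic temporal-consistency penalty on per-frame box predictions in PyTorch. It is only an illustrative stand-in, not the paper's actual STAloss: the tensor layout (T frames, N objects, [cx, cy, w, h] boxes), the smooth-L1 penalty, and the function name `temporal_consistency_loss` are assumptions made for this sketch.

```python
# Minimal sketch (NOT the paper's STAloss): a temporal-consistency penalty
# that encourages box predictions for the same object to vary smoothly
# across consecutive frames. Box format and weighting are illustrative.
import torch
import torch.nn.functional as F


def temporal_consistency_loss(boxes_per_frame: torch.Tensor) -> torch.Tensor:
    """boxes_per_frame: (T, N, 4) tensor of [cx, cy, w, h] predictions for
    N objects over T consecutive frames (hypothetical layout)."""
    # Frame-to-frame differences along the time axis.
    deltas = boxes_per_frame[1:] - boxes_per_frame[:-1]  # (T-1, N, 4)
    # Smooth-L1 toward zero change: tolerates small motion, penalizes jitter.
    return F.smooth_l1_loss(deltas, torch.zeros_like(deltas))


if __name__ == "__main__":
    # Toy usage: 5 frames, 3 objects, random box predictions.
    preds = torch.rand(5, 3, 4, requires_grad=True)
    loss = temporal_consistency_loss(preds)
    loss.backward()
    print(float(loss))
```

In practice such a term would be added to the usual per-frame detection losses with a weighting coefficient; the exact formulation used by the authors is given in the paper and repository, not here.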
