Paper Title
FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding
Paper Authors
Paper Abstract
Visual scene understanding is the core task behind any critical decision made by a computer vision system. Although popular computer vision datasets such as Cityscapes, MS-COCO, and PASCAL provide good benchmarks for several tasks (e.g., image classification, segmentation, object detection), they are hardly suitable for post-disaster damage assessment. Existing natural disaster datasets, on the other hand, consist mainly of satellite imagery, which has low spatial resolution and a long revisit period, so they offer little scope for quick and efficient damage assessment. Unmanned aerial vehicles (UAVs) can readily access hard-to-reach places during a disaster and collect the high-resolution imagery that the aforementioned computer vision tasks require. To address these issues, we present FloodNet, a high-resolution UAV imagery dataset captured after Hurricane Harvey. The dataset shows the post-flood damage in the affected areas. The images are labeled pixel-wise for the semantic segmentation task, and questions are produced for the visual question answering task. FloodNet poses several challenges, including detecting flooded roads and buildings and distinguishing natural water from flood water. With the advancement of deep learning algorithms, the impact of a disaster can be analyzed to build a precise understanding of the affected areas. In this paper, we compare and contrast the performance of baseline methods for image classification, semantic segmentation, and visual question answering on our dataset.