Paper Title
Examining Data Imbalance in Crowdsourced Reports for Improving Flash Flood Situational Awareness
Authors
Abstract
Crowdsourced data have been finding practical use in enhancing situational awareness during disasters. While recent studies have shown promising results regarding the potential of crowdsourced data for flood mapping, little attention has been paid to data imbalance issues that could introduce biases. We examine the biases present in crowdsourced reports to identify data imbalances, with the goal of improving disaster situational awareness. We examine sample bias, spatial bias, and demographic bias by analyzing flooding reported through 3-1-1, Waze reports, and FEMA damage data collected in the aftermath of Tropical Storm Imelda in 2019 and Hurricane Ida in 2021. Integrating other flooding-related topics from 3-1-1 reports into the Global Moran's I and Local Indicator of Spatial Association (LISA) tests revealed more communities that were impacted by floods. To examine spatial bias, we perform the LISA and BI-LISA tests on the three datasets at the census tract and census block group levels. Comparing the two geographical aggregations, we found that the larger spatial aggregation, census tracts, shows less data imbalance in the results. Finally, a one-way analysis of variance (ANOVA) test performed on the clusters generated from the BI-LISA shows that data imbalance exists in areas where minority populations reside. Through a regression analysis, we found that 3-1-1 and Waze reports have data imbalance limitations in areas where minority populations reside. The findings of this study advance the understanding of data imbalances and biases in crowdsourced datasets that are increasingly used for disaster situational awareness.