Paper Title

Benchmark Dataset for Precipitation Forecasting by Post-Processing the Numerical Weather Prediction

Paper Authors

Kim, Taehyeon, Ho, Namgyu, Kim, Donggyu, Yun, Se-Young

Paper Abstract

Precipitation forecasting is an important scientific challenge with wide-reaching impacts on society. Historically, this challenge has been tackled using numerical weather prediction (NWP) models, which are grounded in physics-based simulations. Recently, many works have proposed an alternative approach that uses end-to-end deep learning (DL) models to replace physics-based NWP models. While these DL methods show improved performance and computational efficiency, they exhibit limitations in long-term forecasting and lack explainability. In this work, we present a hybrid NWP-DL workflow to fill the gap between standalone NWP and DL approaches. Under this workflow, the outputs of NWP models are fed into a deep neural network, which post-processes the data to yield a refined precipitation forecast. The deep model is trained with supervision, using Automatic Weather Station (AWS) observations as ground-truth labels. This approach can achieve the best of both worlds, and can even benefit from future improvements in NWP technology. To facilitate study in this direction, we present a novel dataset focused on the Korean Peninsula, termed KoMet (Korea Meteorological Dataset), comprising NWP outputs and AWS observations. For the NWP model, the Global Data Assimilation and Prediction Systems-Korea Integrated Model (GDAPS-KIM) is utilized. We provide an analysis of a comprehensive set of baseline methods aimed at addressing the challenges of KoMet, including the sparsity of AWS observations and class imbalance. To lower the barrier to entry and encourage further study, we also provide an extensive open-source Python package for data processing and model development. Our benchmark data and code are available at https://github.com/osilab-kaist/KoMet-Benchmark-Dataset.
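The two challenges named in the abstract, sparse AWS labels and class imbalance, can be illustrated with a loss that skips grid points without a station and weights the remaining points by class. The following is a minimal sketch under stated assumptions, not the KoMet package's implementation: the function name `masked_weighted_cross_entropy`, the `ignore_label` sentinel, the three-class setup, and the weight values are all hypothetical and chosen only for illustration.

```python
import math

def masked_weighted_cross_entropy(logits, labels, class_weights, ignore_label=-1):
    """Class-weighted cross-entropy averaged over labeled grid points only.

    logits: one list of class scores per grid point (hypothetical post-processing
            network output).
    labels: one integer class per grid point; ignore_label marks points with no
            station observation (assumed sentinel, not from the paper).
    class_weights: one weight per class; rarer classes would get larger weights.
    """
    total_loss = 0.0
    total_weight = 0.0
    for scores, label in zip(logits, labels):
        if label == ignore_label:
            continue  # no ground truth at this grid point, so it adds no loss
        # Numerically stable log-softmax for the true class.
        max_score = max(scores)
        log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        log_prob = scores[label] - log_norm
        weight = class_weights[label]
        total_loss += -weight * log_prob
        total_weight += weight
    if total_weight == 0.0:
        return 0.0  # no labeled points at all
    return total_loss / total_weight

# Three grid points, three hypothetical classes; the middle point has no station.
example_logits = [[0.0, 0.0, 0.0], [5.0, -5.0, 0.0], [0.0, 0.0, 0.0]]
example_labels = [0, -1, 2]
example_weights = [1.0, 2.0, 5.0]
example_loss = masked_weighted_cross_entropy(example_logits, example_labels, example_weights)
```

Because the unlabeled point is skipped, its logits have no effect on the result. Dividing by the summed weights rather than the point count keeps the loss scale comparable when the class mix among labeled points changes.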
