Paper Title
Towards General-Purpose Representation Learning of Polygonal Geometries
Paper Authors
Paper Abstract
Neural network representation learning for spatial data is a common need in geographic artificial intelligence (GeoAI) problems. In recent years, many advances have been made in representation learning for points, polylines, and networks, whereas little progress has been made for polygons, especially complex polygonal geometries. In this work, we focus on developing a general-purpose polygon encoding model that can encode a polygonal geometry (with or without holes, single polygon or multipolygon) into an embedding space. The resulting embeddings can be used directly (or fine-tuned) for downstream tasks such as shape classification and spatial relation prediction. To achieve model generalizability guarantees, we identify a few desirable properties: loop origin invariance, trivial vertex invariance, part permutation invariance, and topology awareness. We explore two different designs for the encoder: one derives all representations in the spatial domain; the other leverages spectral domain representations. For the spatial domain approach, we propose ResNet1D, a 1D CNN-based polygon encoder, which uses circular padding to achieve loop origin invariance on simple polygons. For the spectral domain approach, we develop NUFTspec based on the Non-Uniform Fourier Transformation (NUFT), which naturally satisfies all the desired properties. We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets, DBSR-46K and DBSR-cplx46K. Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins. While ResNet1D suffers from performance degradation after shape-invariant geometry modifications, NUFTspec is very robust to these modifications due to the nature of the NUFT.
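To illustrate why circular padding can give loop origin invariance on a simple polygon, here is a minimal NumPy sketch. It is not the ResNet1D architecture from the paper: the function names `circular_conv1d` and `encode_ring`, the single convolution layer, the ReLU, and the global max pooling are all assumptions chosen for illustration. The ring is assumed to be given without a repeated closing vertex.

```python
import numpy as np

def circular_conv1d(x, kernel):
    """Convolve along a closed vertex ring with wrap-around (circular) padding.

    x:      (n, c_in) vertex features in ring order, closing vertex not repeated
    kernel: (k, c_in, c_out) convolution weights (hypothetical, for illustration)
    """
    n, k = x.shape[0], kernel.shape[0]
    pad = k // 2
    # Index windows modulo n so the window at each vertex wraps around the ring.
    idx = (np.arange(n)[:, None] + np.arange(k)[None, :] - pad) % n
    windows = x[idx]                      # shape (n, k, c_in)
    return np.einsum("nkc,kco->no", windows, kernel)

def encode_ring(x, kernel):
    """Toy encoder: circular convolution, ReLU, then global max pooling."""
    h = np.maximum(circular_conv1d(x, kernel), 0.0)
    # Pooling over the vertex axis discards where the ring "starts".
    return h.max(axis=0)

rng = np.random.default_rng(0)
ring = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 2.0], [0.0, 1.0]])
kernel = rng.normal(size=(3, 2, 4))

emb_a = encode_ring(ring, kernel)
# Same polygon, but the vertex list starts two vertices later.
emb_b = encode_ring(np.roll(ring, 2, axis=0), kernel)
print(np.allclose(emb_a, emb_b))
```

With wrap-around indexing, shifting the starting vertex only shifts the per-vertex feature rows, so a pooling step that is symmetric over vertices returns the same embedding. This sketch covers a single ring only, which matches the abstract's statement that the property holds for simple polygons.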