Paper title
Spatial Language Representation with Multi-Level Geocoding
Paper authors
Paper abstract
We present a multi-level geocoding model (MLG) that learns to associate text with geographic locations. The Earth's surface is represented using space-filling curves that decompose the sphere into a hierarchy of similarly sized, non-overlapping cells. MLG balances generalization and accuracy by combining losses across multiple levels and predicting cells at each level simultaneously. Without any dataset-specific tuning, we show that MLG obtains state-of-the-art results for toponym resolution on three English datasets. Furthermore, it obtains large gains without any knowledge base metadata, demonstrating that it can effectively learn the connection between text spans and coordinates, and thus can be extended to toponyms not present in knowledge bases.
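The two ideas in the abstract, a hierarchy of cells and a loss combined across levels, can be illustrated with a minimal sketch. It is not the authors' implementation. As a stand-in for the space-filling-curve decomposition of the sphere, it uses a plain latitude/longitude quadtree, whose cells are not similarly sized. The names `cell_path`, `cross_entropy`, and `multi_level_loss` are hypothetical, and the weighted sum of per-level cross-entropies is an assumption, since the abstract does not say how the losses are combined.

```python
import math


def cell_path(lat, lng, max_level):
    """Return the quadtree cell index at levels 1..max_level (coarse to fine).

    Assumption: a lat/lng quadtree stands in for the hierarchical
    decomposition of the sphere described in the abstract.
    Level k has 4**k cells, indexed 0 .. 4**k - 1.
    """
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    path = []
    index = 0
    for _ in range(max_level):
        mid_lat = (south + north) / 2
        mid_lng = (west + east) / 2
        quadrant = 0
        if lat >= mid_lat:      # northern half
            quadrant += 2
            south = mid_lat
        else:
            north = mid_lat
        if lng >= mid_lng:      # eastern half
            quadrant += 1
            west = mid_lng
        else:
            east = mid_lng
        # A child's index extends its parent's index by one base-4 digit.
        index = index * 4 + quadrant
        path.append(index)
    return path


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multi_level_loss(logits_per_level, targets, weights=None):
    """Weighted sum of per-level cross-entropies (assumed combination rule)."""
    if weights is None:
        weights = [1.0] * len(targets)
    return sum(
        w * cross_entropy(logits, t)
        for w, logits, t in zip(weights, logits_per_level, targets)
    )


# Usage: one coordinate yields one target cell per level.
targets = cell_path(48.8566, 2.3522, 2)          # levels 1 and 2
logits = [[0.0] * 4, [0.0] * 16]                 # untrained, uniform scores
loss = multi_level_loss(logits, targets)
```

Because every fine cell nests inside exactly one coarse cell, a prediction that misses at the finest level can still be rewarded for landing in the right coarse cell, which is one way to read the abstract's balance between generalization and accuracy.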