GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding

Zekun Li, Wenxuan Zhou, Yao Yi Chiang, Muhao Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Humans subconsciously engage in geospatial reasoning when reading articles. We recognize place names and their spatial relations in text and mentally associate them with their physical locations on Earth. Although pretrained language models can mimic this cognitive process using linguistic context, they do not utilize valuable geospatial information in large, widely available geographical databases, e.g., OpenStreetMap. This paper introduces GEOLM, a geospatially grounded language model that enhances the understanding of geo-entities in natural language. GEOLM leverages geo-entity mentions as anchors to connect linguistic information in text corpora with geospatial information extracted from geographical databases. GEOLM connects the two types of context through contrastive learning and masked language modeling. It also incorporates a spatial coordinate embedding mechanism to encode distance and direction relations to capture geospatial context. In the experiment, we demonstrate that GEOLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing, which bridge the gap between natural language processing and geospatial sciences. The code is publicly available at https://github.com/knowledge-computing/geolm.
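The abstract states that GEOLM encodes distance and direction relations between geo-entities. As a rough illustration of those two relations only (not the paper's spatial coordinate embedding, which the abstract does not specify), the sketch below computes great-circle distance and initial compass bearing between two latitude/longitude points. The function names, the Earth-radius constant, and the choice of the haversine formula are assumptions made for this example.

```python
import math

# Assumed constant: mean Earth radius in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    # atan2 returns a value in (-180, 180]; wrap it into [0, 360).
    return math.degrees(math.atan2(y, x)) % 360.0


if __name__ == "__main__":
    # Hypothetical anchor entity and neighbor, one degree of longitude apart on the equator.
    anchor = (0.0, 0.0)
    neighbor = (0.0, 1.0)
    print(haversine_km(*anchor, *neighbor), initial_bearing_deg(*anchor, *neighbor))
```

A model could consume such pairwise values, or the raw coordinates, as input features for each neighboring entity; how GEOLM actually does this is described in the paper and the linked repository.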

Original language: English (US)
Title of host publication: EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics (ACL)
Pages: 5227-5240
Number of pages: 14
ISBN (Electronic): 9798891760608
State: Published - 2023
Event: 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 - Hybrid, Singapore, Singapore
Duration: Dec 6 2023 to Dec 10 2023

Publication series

Name: EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Country/Territory: Singapore
City: Hybrid, Singapore
Period: 12/6/23 to 12/10/23

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.
