Incorporating spatial context for post-OCR in map images

Min Namgung, Yao-Yi Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Extracting text from historical maps using Optical Character Recognition (OCR) engines often results in partially or incorrectly recognized words due to complex map content. Previous work utilizes lexical-based approaches with linguistic context or applies language models to correct OCR results for documents. However, these post-OCR methods cannot directly consider spatial relations of map text for correction. For example, "Mississippi" and "River" constitute the place phrase "Mississippi River" (a linguistic relation), and near a "highway" there is likely to be an intersecting "road" that enters the "highway" (a spatial relation). This paper presents a novel approach that exploits the spatial arrangement of map text using a contextual language model, BART [6], for post-processing of map text from OCR. The approach first structures word-level map text into sentences based on their spatial arrangement, preserving the spatial locations of the words constituting a place name, and then corrects imperfect OCR text using neighboring information. To train BART to capture spatial relations in map text, we automatically generate large numbers of synthetic maps and fine-tune BART with location names and their spatial context. We conduct experiments on synthetic and real-world historical maps of various styles and scales and show that the proposed method achieves significant improvement over the commonly used lexical approach.
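
The pipeline described in the abstract can be pictured as two steps: group word-level OCR detections into 1D pseudo sentences by clustering their map coordinates, then let a BART model rewrite each pseudo sentence using its neighbors as context. The sketch below illustrates this idea; it is not the authors' code, and the OcrWord structure, the DBSCAN clustering with eps=150, and the facebook/bart-base checkpoint (standing in for a model fine-tuned on synthetic maps) are illustrative assumptions.

from dataclasses import dataclass
from typing import List

import numpy as np
from sklearn.cluster import DBSCAN
from transformers import BartForConditionalGeneration, BartTokenizer


@dataclass
class OcrWord:
    text: str   # possibly corrupted OCR text, e.g. "Missisippi"
    x: float    # map-pixel center of the word's bounding box
    y: float


def words_to_pseudo_sentences(words: List[OcrWord], eps: float = 150.0) -> List[str]:
    """Cluster words by spatial proximity and serialize each cluster
    (sorted left-to-right) into a 1D pseudo sentence."""
    coords = np.array([[w.x, w.y] for w in words])
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(coords)
    sentences = []
    for label in sorted(set(labels)):
        cluster = [w for w, lab in zip(words, labels) if lab == label]
        cluster.sort(key=lambda w: w.x)  # naive left-to-right reading order
        sentences.append(" ".join(w.text for w in cluster))
    return sentences


def correct_with_bart(sentences: List[str],
                      model_dir: str = "facebook/bart-base") -> List[str]:
    """Rewrite each pseudo sentence with a BART seq2seq model; model_dir is a
    placeholder for a checkpoint fine-tuned on synthetic map text."""
    tokenizer = BartTokenizer.from_pretrained(model_dir)
    model = BartForConditionalGeneration.from_pretrained(model_dir)
    corrected = []
    for sent in sentences:
        inputs = tokenizer(sent, return_tensors="pt", truncation=True)
        output_ids = model.generate(**inputs, max_length=64, num_beams=4)
        corrected.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return corrected


if __name__ == "__main__":
    detections = [
        OcrWord("Missisippi", 102, 340), OcrWord("Rlver", 195, 338),
        OcrWord("Highvvay", 820, 90), OcrWord("61", 905, 92),
    ]
    for sentence in correct_with_bart(words_to_pseudo_sentences(detections)):
        print(sentence)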

Original language: English (US)
Title of host publication: Proceedings of the 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2022
Editors: Bruno Martins, Dalton Lunga, Song Gao, Shawn Newsam, Lexie Yang, Xueqing Deng, Gengchen Mai
Publisher: Association for Computing Machinery, Inc
Pages: 14-17
Number of pages: 4
ISBN (Electronic): 9781450395328
DOIs
State: Published - Nov 1 2022
Event: 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2022 - Seattle, United States
Duration: Nov 1 2022 → …

Publication series

Name: Proceedings of the 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2022

Conference

Conference: 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2022
Country/Territory: United States
City: Seattle
Period: 11/1/22 → …

Bibliographical note

Conclusion (from the paper):
This paper presented a novel approach that exploits the spatial locations of map text with BART for post-OCR on maps. The main contribution is a method for incorporating spatial context with BART by clustering the spatial locations of OCRed words to convert 2D map text into 1D pseudo sentences, in both training and inference, for post-OCR processing. Overall, our method significantly improves the F1 score on synthetic maps compared to the lexical approach by correcting and predicting even unidentified metadata. However, due to the various types of short words in historical maps, the proposed method may remove short, unseen words during post-OCR processing. In the future, we will include short words (e.g., abbreviations) and incorporate geographic word variations in the training data so that the method can handle many types of place-name variations in map post-OCR processing.

Funding Information:
This material is based upon work supported in part by NVIDIA Corporation, the National Endowment for the Humanities under Award No. HC-278125-21 and Council Reference AH/V009400/1, and the University of Minnesota, Computer Science & Engineering Faculty startup funds. We thank Jina Kim and Yijun Lin for developing the synthetic maps.
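
The note above mentions fine-tuning BART on synthetic maps so that the model learns spatial relations between place names. A hedged sketch of what that training step could look like is shown below; the (noisy, clean) pseudo-sentence pairs, the hyperparameters, and the output directory name are illustrative assumptions, not the authors' setup.

import torch
from transformers import BartForConditionalGeneration, BartTokenizer

# Toy (noisy, clean) pseudo-sentence pairs; in the paper these would come
# from synthetic maps with known place names and spatial arrangements.
pairs = [
    ("Missisippi Rlver", "Mississippi River"),
    ("Hiqhway 61 Rcad", "Highway 61 Road"),
]

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

model.train()
for epoch in range(3):
    for noisy, clean in pairs:
        inputs = tokenizer(noisy, return_tensors="pt")
        labels = tokenizer(clean, return_tensors="pt").input_ids
        loss = model(**inputs, labels=labels).loss  # token-level cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Placeholder output path for the fine-tuned post-OCR model.
model.save_pretrained("bart-map-postocr")
tokenizer.save_pretrained("bart-map-postocr")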

Publisher Copyright:
© 2022 ACM.

Keywords

  • BART
  • information retrieval
  • neural networks
  • post-OCR processing
