Synthetic Map Generation to Provide Unlimited Training Data for Historical Map Text Detection

Zekun Li, Runyu Guan, Qianmu Yu, Yao Yi Chiang, Craig A. Knoblock

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many historical map sheets are publicly available for studies that require long-term historical geographic data. The cartographic design of these maps combines map symbols and text labels. Automatically reading text labels from map images could greatly speed up map interpretation and help generate rich metadata describing the map content. Many text detection algorithms have been proposed to locate text regions in map images automatically, but most of them are trained on out-of-domain datasets (e.g., scenic images). Training data determines the quality of machine learning models, and manually annotating text regions in map images is labor-intensive and time-consuming. On the other hand, existing geographic data sources, such as OpenStreetMap (OSM), contain machine-readable map layers, which allow us to separate out the text layer and obtain text label annotations easily. However, the cartographic styles of OSM map tiles and historical maps differ significantly. This paper proposes a method to automatically generate an unlimited amount of annotated historical map images for training text detection models. We use a style transfer model to convert contemporary map images into historical style and place text labels on them. We show that state-of-the-art text detection models (e.g., PSENet) can benefit from the synthetic historical maps and achieve significant improvement in historical map text detection.
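
The annotation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed-width font, random label positions, and a hypothetical `stylize` function that stands in for the style transfer model and simply returns its input. It shows only how placing labels programmatically yields bounding-box annotations at no manual cost.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies fully to one side of the other.
        return not (self.right <= other.left or other.right <= self.left
                    or self.bottom <= other.top or other.bottom <= self.top)


def stylize(tile):
    """Hypothetical placeholder for the style transfer model (identity here)."""
    return tile


def place_labels(labels, width, height, char_w=8, char_h=14, seed=0, max_tries=50):
    """Choose non-overlapping label positions and return (text, Box) annotations.

    Assumes a fixed-width font of char_w x char_h pixels, which is an
    illustrative simplification rather than anything stated in the paper.
    """
    rng = random.Random(seed)
    placed = []
    for text in labels:
        w, h = char_w * len(text), char_h
        if w > width or h > height:
            continue  # label cannot fit on this tile
        for _ in range(max_tries):
            left = rng.randint(0, width - w)
            top = rng.randint(0, height - h)
            box = Box(left, top, left + w, top + h)
            if not any(box.overlaps(b) for _, b in placed):
                placed.append((text, box))
                break
    return placed


tile = [[0] * 256 for _ in range(256)]  # blank 256x256 placeholder tile
background = stylize(tile)
annotations = place_labels(["Main St", "River Rd", "Oak Park"], 256, 256)
for text, box in annotations:
    print(text, box)
```

Because the label positions are chosen by the program, each synthetic image comes with exact text-region annotations, which is what removes the manual annotation cost mentioned above.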

Original language: English (US)
Title of host publication: Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2021
Editors: Dalton Lunga, Lexie Yang, Song Gao, Bruno Martins, Yingjie Hu, Xueqing Deng, Shawn Newsam
Publisher: Association for Computing Machinery, Inc
Pages: 17-26
Number of pages: 10
ISBN (Electronic): 9781450391207
State: Published - Nov 2 2021
Externally published: Yes
Event: 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2021 - Beijing, China
Duration: Nov 2 2021 → Nov 2 2021

Publication series

Name: Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2021

Conference

Conference: 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, GeoAI 2021
Country/Territory: China
City: Beijing
Period: 11/2/21 → 11/2/21

Bibliographical note

Funding Information:
This material is based upon work supported in part by the National Science Foundation under Grant Nos. IIS 1564164 (to the University of Southern California) and IIS 1563933 (to the University of Colorado at Boulder), NVIDIA Corporation, the National Endowment for the Humanities under Award No. HC-278125-21, and the University of Minnesota, Computer Science & Engineering Faculty startup funds.

Publisher Copyright:
© 2021 ACM.

Keywords

  • datasets
  • historical maps
  • synthetic data generation
  • text detection
