Information extraction from historical maps represents a persistent challenge due to inferior graphical quality and the large data volume of digital map archives, which can hold thousands of digitized map sheets. Traditional map processing techniques typically rely on manually collected templates of the symbol of interest, and thus are not suitable for large-scale information extraction. In order to digitally preserve such large amounts of valuable retrospective geographic information, high levels of automation are required. Herein, we propose an automated machine-learning based framework to extract human settlement symbols, such as buildings and urban areas from historical topographic maps in the absence of training data, employing contemporary geospatial data as ancillary data to guide the collection of training samples. These samples are then used to train a convolutional neural network for semantic image segmentation, allowing for the extraction of human settlement patterns in an analysis-ready geospatial vector data format. We test our method on United States Geological Survey historical topographic maps published between 1893 and 1954. The results are promising, indicating high degrees of completeness in the extracted settlement features (i.e., recall of up to 0.96, F-measure of up to 0.79) and will guide the next steps to provide a fully automated operational approach for large-scale geographic feature extraction from a variety of historical map series. Moreover, the proposed framework provides a robust approach for the recognition of objects which are small in size, generalizable to many kinds of visual documents.
Bibliographical noteFunding Information:
This work was supported in part by the National Science Foundation through the University of Colorado Boulder under Grant IIS 1563933, in part by the National Science Foundation through the University of Southern California under Grant IIS 1564164.
© 2013 IEEE.
- Convolutional neural networks
- digital humanities
- digital preservation
- document analysis
- geospatial analysis
- geospatial artificial intelligence
- human settlement patterns
- image analysis
- weakly supervised learning