Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not "properly" scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.
Bibliographical noteFunding Information:
This research is based upon work supported in part by the University of Southern California under the Undergraduate Research Associates Program (URAP). We thank Rashmina Ramachandran Menon for creating a part of the experiment ground truth and Ronald Yu for managing the Strabo GitHub repository.
© 2016 Elsevier Ltd.
- Accuracy assessment
- Digital map processing
- Geographic information system
- Optical character recognition
- Scanned maps
- Text recognition