TY - GEN
T1 - Content-based ontology matching for GIS datasets
AU - Partyka, Jeffrey
AU - Alipanah, Neda
AU - Khan, Latifur
AU - Thuraisingham, Bhavani
AU - Shekhar, Shashi
PY - 2008
Y1 - 2008
AB - The alignment of separate ontologies by matching related concepts continues to attract great attention within the database and artificial intelligence communities, especially since semantic heterogeneity across data sources remains a widespread and relevant problem. In particular, the Geographic Information System (GIS) domain presents unique forms of semantic heterogeneity that require a variety of matching approaches. Our approach considers content-based techniques for aligning GIS ontologies. We examine the instance data associated with the compared concepts and apply a content-matching strategy that measures similarity using value types derived from the N-grams present in the data. We pay special attention to a method that combines mutual information and N-grams, developing two separate variations and testing them on GIS datasets spanning multiple jurisdictions. To align concepts, we first find the appropriate columns. For this, we exploit the mutual information between two columns based on the type distribution of their content. Intuitively, if two columns are semantically the same, their type distributions should be very similar. We justify the conceptual validity of our ontology alignment technique with a series of experimental results that demonstrate the efficacy and utility of our algorithms on a wide variety of authentic GIS data.
KW - Dataset
KW - Geographic information systems
KW - Ontology
KW - Ontology alignment
KW - Schema matching
UR - http://www.scopus.com/inward/record.url?scp=70449700362&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449700362&partnerID=8YFLogxK
U2 - 10.1145/1463434.1463496
DO - 10.1145/1463434.1463496
M3 - Conference contribution
AN - SCOPUS:70449700362
SN - 9781605583235
T3 - GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems
SP - 407
EP - 410
BT - Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
T2 - 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
Y2 - 5 November 2008 through 7 November 2008
ER -