TY - GEN
T1 - Should SDBMS support a join index?
T2 - 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
AU - Mohan, Pradeep
AU - Shekhar, Shashi
AU - Levine, Ned
AU - Wilson, Ronald E.
AU - George, Betsy
AU - Celik, Mete
PY - 2008
Y1 - 2008
N2 - Given a spatial crime data warehouse that is updated infrequently, a set of operations O, and constraints on storage and update overheads, the index type selection problem is to find a set of index types that can reduce the I/O cost of the set of operations. The index type selection problem is important for improving user experience and system resource utilization in crucial spatial statistics application domains such as mapping and analysis for public safety, public health, ecology, and transportation. This is because the response time of frequent queries based on the set of operations can be improved significantly by an effective choice of index types. Many spatial statistical queries in these application domains make use of a spatial neighborhood matrix, known as W in spatial statistics, which can be thought of as a spatial self-join in spatial database terminology. Currently supported index types such as the B-Tree and R-Tree families do not adequately support spatial statistical analysis because they require on-the-fly computation of the W-Matrix, which slows the analysis down. In contrast, this paper argues that Spatial Database Management Systems (SDBMS) should support a join index to materialize the W-Matrix and eliminate on-the-fly computation of the common self-join. A detailed case study using CrimeStat, a popular spatial statistical software package for public safety, shows that join indices can significantly speed up spatial analysis such as calculation of Ripley's K and identification of hotspots.
AB - Given a spatial crime data warehouse that is updated infrequently, a set of operations O, and constraints on storage and update overheads, the index type selection problem is to find a set of index types that can reduce the I/O cost of the set of operations. The index type selection problem is important for improving user experience and system resource utilization in crucial spatial statistics application domains such as mapping and analysis for public safety, public health, ecology, and transportation. This is because the response time of frequent queries based on the set of operations can be improved significantly by an effective choice of index types. Many spatial statistical queries in these application domains make use of a spatial neighborhood matrix, known as W in spatial statistics, which can be thought of as a spatial self-join in spatial database terminology. Currently supported index types such as the B-Tree and R-Tree families do not adequately support spatial statistical analysis because they require on-the-fly computation of the W-Matrix, which slows the analysis down. In contrast, this paper argues that Spatial Database Management Systems (SDBMS) should support a join index to materialize the W-Matrix and eliminate on-the-fly computation of the common self-join. A detailed case study using CrimeStat, a popular spatial statistical software package for public safety, shows that join indices can significantly speed up spatial analysis such as calculation of Ripley's K and identification of hotspots.
KW - Join index
KW - Self-join
KW - Spatial statistics
KW - W matrix
UR - http://www.scopus.com/inward/record.url?scp=70449700375&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449700375&partnerID=8YFLogxK
U2 - 10.1145/1463434.1463481
DO - 10.1145/1463434.1463481
M3 - Conference contribution
AN - SCOPUS:70449700375
SN - 9781605583235
T3 - GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems
SP - 327
EP - 336
BT - Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM GIS 2008
Y2 - 5 November 2008 through 7 November 2008
ER -