This paper describes SpatialHadoop, a full-fledged MapReduce framework with native support for spatial data. SpatialHadoop is a comprehensive extension to Hadoop that injects spatial data awareness into each Hadoop layer, namely the language, storage, MapReduce, and operations layers. In the language layer, SpatialHadoop adds a simple and expressive high-level language for spatial data types and operations. In the storage layer, SpatialHadoop adapts traditional spatial index structures (Grid, R-tree, and R+-tree) to form a two-level spatial index. SpatialHadoop enriches the MapReduce layer with two new components, SpatialFileSplitter and SpatialRecordReader, for efficient and scalable spatial data processing. In the operations layer, SpatialHadoop is already equipped with a dozen operations, including range query, kNN, and spatial join. Other spatial operations are also implemented following a similar approach. Extensive experiments on a real system prototype and real datasets show that SpatialHadoop achieves orders of magnitude better performance than Hadoop for spatial data processing.
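As a rough illustration of why a partition-level spatial index helps a range query, the sketch below groups points into grid cells, skips every cell whose rectangle does not overlap the query, and scans only the remaining cells. It is a single-machine toy under stated assumptions, not SpatialHadoop's API: the function names (`build_grid_index`, `cell_mbr`, `overlaps`, `range_query`), the uniform `cell_size`, and the sample points are all hypothetical, and the per-cell scan stands in for whatever local index a real partition would carry.

```python
from collections import defaultdict


def build_grid_index(points, cell_size):
    """Group points by the grid cell (partition) that contains them."""
    partitions = defaultdict(list)
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        partitions[cell].append((x, y))
    return dict(partitions)


def cell_mbr(cell, cell_size):
    """Bounding rectangle (xmin, ymin, xmax, ymax) of a grid cell."""
    cx, cy = cell
    return (cx * cell_size, cy * cell_size,
            (cx + 1) * cell_size, (cy + 1) * cell_size)


def overlaps(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(partitions, cell_size, query):
    """Return (sorted matching points, number of partitions scanned)."""
    # Pruning step: keep only partitions whose rectangle overlaps the query.
    selected = [c for c in partitions
                if overlaps(cell_mbr(c, cell_size), query)]
    xmin, ymin, xmax, ymax = query
    results = []
    # Scan step: check individual records only inside surviving partitions.
    for cell in selected:
        for x, y in partitions[cell]:
            if xmin <= x <= xmax and ymin <= y <= ymax:
                results.append((x, y))
    return sorted(results), len(selected)


# Hypothetical sample data, chosen only to exercise the pruning step.
points = [(1, 1), (2, 8), (5, 5), (9, 9), (12, 3), (14, 14)]
index = build_grid_index(points, cell_size=5)
matches, scanned = range_query(index, 5, (0, 0, 6, 6))
print(matches, scanned, len(index))
```

In this toy run the query touches three of the five non-empty cells, so two partitions are never read. A plain scan, by contrast, would have to look at every record regardless of the query rectangle.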
|Original language||English (US)|
|Title of host publication||2015 IEEE 31st International Conference on Data Engineering, ICDE 2015|
|Publisher||IEEE Computer Society|
|Number of pages||12|
|State||Published - May 26 2015|
|Event||2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of (Duration: Apr 13 2015 → Apr 17 2015)|
|Name||Proceedings - International Conference on Data Engineering|
|Other||2015 31st IEEE International Conference on Data Engineering, ICDE 2015|
|Country/Territory||Korea, Republic of|
|Period||4/13/15 → 4/17/15|
Bibliographical note
Publisher Copyright: © 2015 IEEE.