Statistically-Robust Clustering Techniques for Mapping Spatial Hotspots: A Survey

Yiqun Xie, Shashi Shekhar, Yan Li

Research output: Contribution to journal › Review article › peer-review

Abstract

Mapping of spatial hotspots, i.e., regions with significantly higher rates of generating cases of certain events (e.g., disease or crime cases), is an important task in diverse societal domains, including public health, public safety, transportation, agriculture, environmental science, and so on. Clustering techniques required by these domains differ from traditional clustering methods due to the high economic and social costs of spurious results (e.g., false alarms of crime clusters). As a result, statistical rigor is needed explicitly to control the rate of spurious detections. To address this challenge, techniques for statistically-robust clustering (e.g., scan statistics) have been extensively studied by the data mining and statistics communities. In this survey, we present an up-to-date and detailed review of the models and algorithms developed by this field. We first present a general taxonomy for statistically-robust clustering, covering key steps of data and statistical modeling, region enumeration and maximization, and significance testing. We further discuss different paradigms and methods within each of the key steps. Finally, we highlight research gaps and potential future directions, which may serve as a stepping stone in generating new ideas and thoughts in this growing field and beyond.
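The three steps named in the abstract (statistical modeling, region enumeration and maximization, and significance testing) can be illustrated with a small grid-based scan statistic. This is a minimal sketch and not the authors' implementation. It assumes a Poisson likelihood ratio in the style of spatial scan statistics, axis-aligned rectangular windows, and Monte Carlo simulation under the null hypothesis. The function names (`poisson_llr`, `max_llr`, `scan_test`), the `max_side` limit, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np

def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for one candidate region.
    c: observed cases inside, e: expected cases inside, total: all cases."""
    if e <= 0 or c <= e:
        return 0.0  # only regions with an elevated rate count as hotspots
    inside = c * np.log(c / e)
    # The outside term vanishes when every case falls inside the region.
    outside = (total - c) * np.log((total - c) / (total - e)) if c < total else 0.0
    return float(inside + outside)

def max_llr(cases, baseline, max_side=3):
    """Enumerate rectangular windows up to max_side per side and return
    (best score, (row, col, height, width)); assumed enumeration scheme."""
    total = cases.sum()
    expected = baseline / baseline.sum() * total
    # Zero-padded 2D prefix sums give each window sum in constant time.
    pc = np.pad(cases.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    pe = np.pad(expected.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    rows, cols = cases.shape
    best, best_win = 0.0, None
    for h in range(1, max_side + 1):
        for w in range(1, max_side + 1):
            for r in range(rows - h + 1):
                for k in range(cols - w + 1):
                    c = pc[r + h, k + w] - pc[r, k + w] - pc[r + h, k] + pc[r, k]
                    e = pe[r + h, k + w] - pe[r, k + w] - pe[r + h, k] + pe[r, k]
                    score = poisson_llr(c, e, total)
                    if score > best:
                        best, best_win = score, (r, k, h, w)
    return best, best_win

def scan_test(cases, baseline, n_sim=99, max_side=3, seed=0):
    """Monte Carlo significance test: compare the observed maximum score
    against maxima from data simulated under the null (no hotspot)."""
    rng = np.random.default_rng(seed)
    observed, window = max_llr(cases, baseline, max_side)
    probs = (baseline / baseline.sum()).ravel()
    total = int(cases.sum())
    exceed = 0
    for _ in range(n_sim):
        sim = rng.multinomial(total, probs).reshape(cases.shape)
        if max_llr(sim, baseline, max_side)[0] >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_sim + 1)
    return window, observed, p_value

# Toy data: uniform baseline with a planted 2x2 hotspot (illustrative only).
cases = np.full((8, 8), 2)
cases[2:4, 5:7] += 10
baseline = np.ones((8, 8))
window, score, p_value = scan_test(cases, baseline)
print(window, round(score, 2), p_value)
```

Comparing the observed maximum against simulated maxima, rather than testing each window separately, is what keeps the rate of spurious detections under control: the p-value accounts for the fact that many candidate regions were searched.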

Original language: English (US)
Article number: 3487893
Journal: ACM Computing Surveys
Volume: 55
Issue number: 2
DOIs: https://doi.org/10.1145/3487893
State: Published - Mar 2023

Bibliographical note

Funding Information:
This work is supported in part by the NSF under Grants No. 2105133, 2126474, 2126449, 1901099, 1737633, and 2040459; the USDOD under Grant No. HM0210-13-1-0005; ARPA-E under Grant No. DE-AR0000795; USDA under Grant No. 2017-51181-27222; NIH under Grants No. UL1 TR002494, KL2 TR002492, and TL1 TR002493; Google AI for Social Good; and the University of Maryland.

Authors’ addresses: Y. Xie, University of Maryland, Center for Geospatial Information Science, 7251 Preinkert Dr., College Park, MD 20742; email: xie@umd.edu; S. Shekhar and Y. Li, University of Minnesota, Department of Computer Science, 200 Union Street SE, Minneapolis, MN 55455; emails: {shekhar, lixx4266}@umn.edu.

https://doi.org/10.1145/3487893

Publisher Copyright:
© 2022 Association for Computing Machinery.

Keywords

  • clustering
  • hotspot
  • mapping
  • scan statistics
  • statistical rigor
