Gaussian process subset scanning for anomalous pattern detection in non-iid data

William Herlands, Edward McFowland, Andrew G. Wilson, Daniel B. Neill

Research output: Contribution to conference › Paper

Abstract

Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with a novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple data streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.
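
The general idea of scoring whole subsets of correlated data, rather than individual points, can be illustrated with a small sketch. This is not the statistic or search procedure from the paper, whose details are not given in this record. It assumes a zero-mean Gaussian process null model with a squared-exponential kernel, a constant mean shift as the alternative, and an exhaustive search over contiguous windows of a 1-D series. All function names, kernel settings, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(x, lengthscale=5.0, variance=1.0):
    """Squared-exponential covariance between all pairs of 1-D inputs."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def mean_shift_llr(y, K_inv, mask):
    """Maximized log-likelihood ratio for a constant mean shift on `mask`.

    H0: y ~ N(0, K); H1: y ~ N(delta * mask, K), with delta set to its
    maximum-likelihood value. Because K models the correlation between
    points, the score does not assume iid observations.
    """
    s = mask.astype(float)
    num = s @ K_inv @ y
    den = s @ K_inv @ s
    return num ** 2 / (2.0 * den)


def scan_windows(y, K, min_len=3, max_len=20):
    """Score every contiguous window; return (best_score, start, stop)."""
    n = len(y)
    K_inv = np.linalg.inv(K)
    best = (-np.inf, 0, 0)
    for start in range(n):
        for stop in range(start + min_len, min(start + max_len, n) + 1):
            mask = np.zeros(n, dtype=bool)
            mask[start:stop] = True
            score = mean_shift_llr(y, K_inv, mask)
            if score > best[0]:
                best = (score, start, stop)
    return best


# Synthetic example (assumed data, not from the paper's experiments).
rng = np.random.default_rng(0)
x = np.arange(100, dtype=float)
K = rbf_kernel(x) + 0.25 * np.eye(len(x))  # GP covariance plus noise variance
y = rng.multivariate_normal(np.zeros(len(x)), K)
y[40:50] += 3.0                            # injected anomaly on indices 40..49
best_score, best_start, best_stop = scan_windows(y, K)
print(best_score, best_start, best_stop)
```

The exhaustive window search here is only practical for small, ordered inputs; it stands in for the subset scanning step and makes no claim about how the paper searches over subsets.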

Original language: English (US)
Pages: 425-434
Number of pages: 10
State: Published - Jan 1 2018
Event: 21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018 - Playa Blanca, Lanzarote, Canary Islands, Spain
Duration: Apr 9 2018 to Apr 11 2018

Conference

Conference: 21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018
Country: Spain
City: Playa Blanca, Lanzarote, Canary Islands
Period: 4/9/18 to 4/11/18

Fingerprint

Gaussian Process
Anomalous
Scanning
Subset
Set theory
Opioids
Correlated Data
Log-likelihood Ratio
Likelihood Ratio Statistic
Statistics
Irregularity
Data Streams
Numerics
Open Source
Integrate
Simulation

Cite this

Herlands, W., McFowland, E., Wilson, A. G., & Neill, D. B. (2018). Gaussian process subset scanning for anomalous pattern detection in non-iid data. 425-434. Paper presented at 21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018, Playa Blanca, Lanzarote, Canary Islands, Spain.

@conference{b075451bbb0b4a928440ffa7e70b3518,
title = "Gaussian process subset scanning for anomalous pattern detection in non-iid data",
abstract = "Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple data streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.",
author = "William Herlands and Edward McFowland and Wilson, {Andrew G.} and Neill, {Daniel B.}",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
pages = "425--434",
note = "21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018 ; Conference date: 09-04-2018 Through 11-04-2018",

}

TY - CONF

T1 - Gaussian process subset scanning for anomalous pattern detection in non-iid data

AU - Herlands, William

AU - McFowland, Edward

AU - Wilson, Andrew G.

AU - Neill, Daniel B.

PY - 2018/1/1

Y1 - 2018/1/1

AB - Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple data streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.

UR - http://www.scopus.com/inward/record.url?scp=85067798849&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067798849&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85067798849

SP - 425

EP - 434

ER -