Wadjet: Finding outliers in multiple multi-dimensional heterogeneous data streams

Shiblee Sadik, Le Gruenwald, Eleazar Leal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

Data streams are sequences of data points that have the properties of transiency, infiniteness, concept drift, uncertainty, multi-dimensionality, cross-correlation among different streams, asynchronous arrival, and heterogeneity. In this paper we propose a new outlier detection technique for multiple multi-dimensional data streams, called Wadjet, that addresses all the issues of outlier detection in multiple data streams. Wadjet exploits the temporal correlations to identify outliers in each individual data stream, and after this, it exploits the cross-correlations between data streams to identify points that do not conform with these cross-correlations. Experiments comparing Wadjet against existing techniques on real and synthetic datasets show that Wadjet achieves 18.8X higher precision, and competitive execution time and recall.
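
The two-stage idea described in the abstract (per-stream temporal check first, cross-stream check second) can be illustrated with a minimal Python sketch. This is not the Wadjet algorithm: the abstract does not give its details, so the sliding-window z-score, the median/MAD test on the difference between two streams, the function names, the parameters, and the sample data are all assumptions made for illustration. The sketch also assumes two one-dimensional, synchronously sampled streams, which sidesteps the asynchronous arrival, uncertainty, and heterogeneity the paper targets.

```python
from collections import deque
from statistics import mean, median, stdev


def temporal_outliers(stream, window=4, threshold=3.0):
    """Stage 1 (assumed): flag points far from the mean of the last `window` clean points."""
    history = deque(maxlen=window)
    flags = []
    for x in stream:
        is_outlier = False
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            is_outlier = sigma > 0 and abs(x - mu) > threshold * sigma
        flags.append(is_outlier)
        # Flagged points are kept out of the window so a spike does not distort later checks.
        if not is_outlier:
            history.append(x)
    return flags


def cross_outliers(a, b, flagged_a, flagged_b, threshold=3.0):
    """Stage 2 (assumed): flag time steps where b - a departs from its typical value."""
    skip = [fa or fb for fa, fb in zip(flagged_a, flagged_b)]
    diffs = [y - x for x, y in zip(a, b)]
    clean = [d for d, s in zip(diffs, skip) if not s]
    center = median(clean)
    # Median absolute deviation, scaled to be comparable to a standard deviation.
    scale = 1.4826 * median([abs(d - center) for d in clean])
    return [
        (not s) and scale > 0 and abs(d - center) > threshold * scale
        for d, s in zip(diffs, skip)
    ]


# Hypothetical data: stream_b tracks stream_a with an offset of about 5.
stream_a = [10.0, 11.0, 12.0, 11.0, 10.0, 11.0, 12.0, 11.0, 30.0, 11.0, 12.0, 11.0]
stream_b = [15.0, 16.1, 16.9, 16.0, 15.1, 15.9, 17.0, 16.1, 14.9, 16.0, 15.5, 16.0]

temporal_a = temporal_outliers(stream_a)
temporal_b = temporal_outliers(stream_b)
cross = cross_outliers(stream_a, stream_b, temporal_a, temporal_b)

print([i for i, f in enumerate(temporal_a) if f])  # [8]
print([i for i, f in enumerate(cross) if f])       # [10]
```

In this toy data, index 8 is a spike visible within stream_a alone, while index 10 of stream_b looks normal within its own stream and is caught only because it breaks the usual relationship between the two streams.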

Original language: English (US)
Title of host publication: Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1236-1239
Number of pages: 4
ISBN (Electronic): 9781538655207
DOIs
State: Published - Oct 24 2018
Event: 34th IEEE International Conference on Data Engineering, ICDE 2018 - Paris, France
Duration: Apr 16 2018 to Apr 19 2018

Publication series

Name: Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018

Other

Other: 34th IEEE International Conference on Data Engineering, ICDE 2018
Country/Territory: France
City: Paris
Period: 4/16/18 to 4/19/18

Keywords

  • Data streams
  • Heterogeneous data streams
  • Outlier detection
  • Uncertain data streams
