Feature bagging for outlier detection

Aleksandar Lazarevic, Vipin Kumar

Research output: Contribution to conference (Paper)

249 Scopus citations

Abstract

Outlier detection has recently become an important problem in many industrial and financial applications. In this paper, a novel feature bagging approach for detecting outliers in very large, high-dimensional and noisy databases is proposed. It combines results from multiple outlier detection algorithms that are applied using different sets of features. Every outlier detection algorithm uses a small subset of features that are randomly selected from the original feature set. As a result, each outlier detector identifies different outliers and assigns to every data record an outlier score that corresponds to its probability of being an outlier. The outlier scores computed by the individual outlier detection algorithms are then combined in order to find better-quality outliers. Experiments performed on several synthetic and real-life data sets show that the proposed methods for combining outputs from multiple outlier detection algorithms provide non-trivial improvements over the base algorithm.
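The procedure the abstract describes (score every record on several random feature subsets, then combine the scores) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract names neither the base detector nor the combination rule, so the sketch assumes a k-th-nearest-neighbour distance as the base score and a plain sum as the combiner. The function names and the parameters `n_rounds`, `subset_size`, `k` and `seed` are hypothetical. The base score used here is a distance, not a probability, and in practice scores from different subsets may need normalising before they are combined.

```python
import math
import random


def knn_distance_scores(rows, features, k):
    """Score each row by the distance to its k-th nearest neighbour,
    measured only on the given feature indices (larger = more outlying).
    Assumed base detector; the abstract does not specify one."""
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(
            math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
            for j, b in enumerate(rows)
            if j != i
        )
        scores.append(dists[k - 1])
    return scores


def feature_bagging_scores(rows, n_rounds=10, subset_size=2, k=3, seed=0):
    """Sum base-detector scores over n_rounds random feature subsets.
    Summation is an assumed combination rule."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    combined = [0.0] * len(rows)
    for _ in range(n_rounds):
        # Each round sees only a random subset of the original features.
        features = rng.sample(range(n_features), subset_size)
        round_scores = knn_distance_scores(rows, features, k)
        combined = [c + s for c, s in zip(combined, round_scores)]
    return combined


if __name__ == "__main__":
    # 18 inliers on a small grid plus one record far away on every feature.
    data = [[i % 3, (i // 3) % 3, i % 2, (i // 2) % 2] for i in range(18)]
    data.append([10, 10, 10, 10])
    scores = feature_bagging_scores(data, n_rounds=5)
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    print("most outlying record index:", ranked[0])
```

Because the last record in the toy data is far from the grid on every feature, it receives the largest score in each round regardless of which features are drawn, so it ranks first after combination.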

Original language: English (US)
Pages: 157-166
Number of pages: 10
DOI: https://doi.org/10.1145/1081870.1081891
State: Published - Dec 1 2005
Event: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Chicago, IL, United States
Duration: Aug 21 2005 - Aug 24 2005

Other

Other: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: Chicago, IL
Period: 8/21/05 - 8/24/05

Keywords

  • Bagging
  • Detection rate
  • False alarm
  • Feature subsets
  • Integration
  • Outlier detection

Cite this

Lazarevic, A., & Kumar, V. (2005). Feature bagging for outlier detection. 157-166. Paper presented at KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, United States. https://doi.org/10.1145/1081870.1081891