Improved feasible solution algorithms for high breakdown estimation

Douglas M. Hawkins, David J. Olive

Research output: Contribution to journal (Article)

52 Scopus citations

Abstract

High breakdown estimation allows one to get reasonable estimates of the parameters from a sample of data even if that sample is contaminated by large numbers of awkwardly placed outliers. Two particular application areas in which this is of interest are multiple linear regression, and estimation of the location vector and scatter matrix of multivariate data. Standard high breakdown criteria for the regression problem are the least median of squares (LMS) and least trimmed squares (LTS); those for the multivariate location/scatter problem are the minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD). All of these present daunting computational problems. The 'feasible solution algorithms' for these criteria have been shown to have excellent performance for text-book sized problems, but their performance on much larger data sets is less impressive. This paper points out a computationally cheaper feasibility condition for LTS, MVE and MCD, and shows how the combination of the criteria leads to improved performance on large data sets. Algorithms incorporating these improvements are available from the first author's Web site.
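To make two of the criteria named in the abstract concrete, the following is a minimal sketch of the standard LTS and MCD objective functions. It is not the authors' feasible solution algorithm, and it does not implement the feasibility condition the paper proposes, which the abstract does not spell out. The function names, the toy data, and the coverage value `h` are assumptions chosen for illustration only.

```python
import numpy as np

def lts_objective(X, y, beta, h):
    """LTS criterion: sum of the h smallest squared residuals at beta."""
    residuals_sq = (y - X @ beta) ** 2
    return np.sort(residuals_sq)[:h].sum()

def mcd_objective(Z, subset):
    """MCD criterion: determinant of the sample covariance of the chosen rows."""
    return np.linalg.det(np.cov(Z[subset], rowvar=False))

# Toy data (made up for illustration): y = 2x exactly, plus one gross outlier.
x = np.arange(1.0, 11.0)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 * x
y[-1] = 100.0

h = 6  # hypothetical coverage; roughly half the sample is a common choice
beta_true = np.array([0.0, 2.0])

lts_value = lts_objective(X, y, beta_true, h)  # ignores the outlying case
ols_value = ((y - X @ beta_true) ** 2).sum()   # dominated by the outlying case
print(lts_value, ols_value)
```

At the true coefficients the trimmed sum is unaffected by the contaminated case, while the full sum of squares is driven almost entirely by it. The computational difficulty the abstract refers to comes from searching over coefficient vectors (or subsets of cases) to minimize such criteria, which this sketch does not attempt.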

Original language: English (US)
Pages (from-to): 1-11
Number of pages: 11
Journal: Computational Statistics and Data Analysis
Volume: 30
Issue number: 1
State: Published - Mar 28 1999

Keywords

  • High breakdown estimation
  • Least trimmed squares
  • Linear model
  • Minimum covariance determinant
  • Minimum volume ellipsoid
  • Outliers
