On Model-Based Clustering of Directional Data with Heavy Tails

Yingying Zhang, Volodymyr Melnykov, Igor Melnykov

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Directional statistics deals with data that can be naturally expressed in the form of vector directions. The von Mises-Fisher distribution is one of the most fundamental parametric models to describe directional data. Mixtures of von Mises-Fisher distributions represent a popular approach to handling heterogeneous populations. However, components of such models can be affected by the presence of mild outliers or cluster tails heavier than what can be accommodated by means of a von Mises-Fisher distribution. To relax these model limitations, a mixture of contaminated von Mises-Fisher distributions is proposed. The performance of the proposed methodology is tested on synthetic data and applied to text and genetics data. The obtained results demonstrate the importance of the proposed procedure and its superiority over the traditional mixture of von Mises-Fisher distributions in the presence of heavy tails.
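To make the idea of a contaminated component concrete, the following is a minimal sketch, not the authors' implementation. It assumes data on the unit sphere in three dimensions, where the von Mises-Fisher normalizing constant has a closed form, and it assumes a contaminated density built by analogy with the contaminated normal distribution: a proportion `alpha` of "typical" points with concentration `kappa`, and the remainder sharing the same mean direction but with concentration reduced by a factor `eta > 1`. The function names and the parameters `alpha` and `eta` are illustrative assumptions; the parameterization used in the paper may differ.

```python
import math


def vmf_pdf_s2(x, mu, kappa):
    """von Mises-Fisher density on the unit sphere in R^3.

    x and mu are unit vectors (length-3 sequences), kappa > 0.
    Uses the closed form C(kappa) = kappa / (4 * pi * sinh(kappa)),
    rewritten to avoid overflow for large kappa.
    """
    dot = sum(a * b for a, b in zip(x, mu))
    norm = kappa / (2.0 * math.pi * (1.0 - math.exp(-2.0 * kappa)))
    return norm * math.exp(kappa * (dot - 1.0))


def contaminated_vmf_pdf_s2(x, mu, kappa, alpha, eta):
    """Assumed two-part contaminated density (illustrative only).

    alpha in (0, 1): share of typical points.
    eta > 1: factor by which concentration is deflated for the
    heavy-tailed part, so kappa / eta gives a flatter density.
    """
    typical = vmf_pdf_s2(x, mu, kappa)
    diffuse = vmf_pdf_s2(x, mu, kappa / eta)
    return alpha * typical + (1.0 - alpha) * diffuse


if __name__ == "__main__":
    mu = (0.0, 0.0, 1.0)
    far = (0.0, 0.0, -1.0)  # point opposite the mean direction
    print(vmf_pdf_s2(far, mu, 10.0))
    print(contaminated_vmf_pdf_s2(far, mu, 10.0, alpha=0.9, eta=5.0))
```

Under these assumptions, the contaminated density places far more mass on directions away from the mean than the plain von Mises-Fisher density with the same `kappa`, which is the sense in which it accommodates heavier tails.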

Original language: English (US)
Pages (from-to): 527-551
Number of pages: 25
Journal: Journal of Classification
Volume: 40
Issue number: 3
State: Published - Nov 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023, The Author(s) under exclusive licence to The Classification Society.

Keywords

  • Directional data
  • EM algorithm
  • Mixture model
  • Von Mises-Fisher distribution
