Clustering Sequence Data with Mixture Markov Chains with Covariates Using Multiple Simplex Constrained Optimization Routine (MSiCOR)

Priyam Das, Deborshee Sen, Debsurya De, Jue Hou, Zahra S.H. Abad, Nicole Kim, Zongqi Xia, Tianxi Cai

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

The Mixture Markov Model (MMM) is a widely used tool for clustering sequences of events from a finite state space. However, because the MMM likelihood is multimodal, maximizing it remains a challenge. Although the Expectation-Maximization (EM) algorithm remains one of the most popular ways to estimate the MMM parameters, its convergence is not always guaranteed. Given the computational challenges in maximizing the mixture likelihood on the constrained parameter space, we develop a pattern search-based global optimization technique that can optimize any objective function on a collection of simplexes, and we use it to maximize the MMM likelihood. This technique is shown to outperform other related global optimization techniques. In simulation experiments, the proposed method is shown to outperform the EM algorithm in MMM estimation performance. The proposed method is applied to cluster multiple sclerosis (MS) patients based on their treatment sequences of disease-modifying therapies (DMTs). We also propose a novel method to cluster people with MS based on DMT prescriptions and associated clinical features (covariates) using MMM with covariates. Based on the analysis, we divided MS patients into three clusters. Further cluster-specific summaries of relevant covariates indicate patient differences among the clusters. Supplementary materials for this article are available online.
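
For readers unfamiliar with the objective discussed in the abstract, the following is a minimal sketch of a mixture Markov model log-likelihood without covariates. It is not the authors' MSiCOR routine or their code. The function name `mmm_log_likelihood` and the array layout are assumptions made for illustration, and all probabilities are assumed strictly positive. Each row of `initial` and `transition`, and the vector `weights`, lies on a probability simplex, which is the kind of constrained parameter space the abstract refers to.

```python
import numpy as np


def mmm_log_likelihood(sequences, weights, initial, transition):
    """Log-likelihood of integer-coded sequences under a mixture of
    first-order Markov chains (illustrative sketch, hypothetical API).

    weights:    shape (K,)       mixing proportions, sums to 1
    initial:    shape (K, S)     initial-state distribution per component
    transition: shape (K, S, S)  row-stochastic transition matrix per component
    """
    weights = np.asarray(weights, dtype=float)
    initial = np.asarray(initial, dtype=float)
    transition = np.asarray(transition, dtype=float)

    total = 0.0
    for seq in sequences:
        seq = np.asarray(seq, dtype=int)
        # Log-probability of the sequence under each of the K components:
        # mixing weight, initial state, then every observed transition.
        log_p = np.log(weights) + np.log(initial[:, seq[0]])
        log_p = log_p + np.log(transition[:, seq[:-1], seq[1:]]).sum(axis=1)
        # Combine the components with a numerically stable log-sum-exp.
        m = log_p.max()
        total += m + np.log(np.exp(log_p - m).sum())
    return total
```

Because every parameter block is constrained to a simplex, any optimizer applied to this function has to keep each block non-negative and summing to one; the abstract's pattern-search method is one approach to that problem, and the sketch above only supplies the objective.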

Original language: English (US)
Pages (from-to): 379-392
Number of pages: 14
Journal: Journal of Computational and Graphical Statistics
Volume: 33
Issue number: 2
State: Published - 2024

Bibliographical note

Publisher Copyright:
© 2023 American Statistical Association and Institute of Mathematical Statistics.

Keywords

  • Disease-modifying therapy
  • Global optimization
  • Markov chain
  • Medical sequence data
  • Mixture model
  • Multiple sclerosis

PubMed: MeSH publication types

  • Journal Article
