Survival Modelling for Data From Combined Cohorts: Opening the Door to Meta Survival Analyses and Survival Analysis Using Electronic Health Records

James H. McVittie, Ana F. Best, David B. Wolfson, David A. Stephens, Julian Wolfson, David L. Buckeridge, Shahinaz M. Gadalla

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Non-parametric estimation of the survival function using observed failure time data depends on the underlying data generating mechanism, including the ways in which the data may be censored and/or truncated. For data arising from a single source or collected from a single cohort, a wide range of estimators have been proposed and compared in the literature. Often, however, it may be possible, and indeed advantageous, to combine and then analyse survival data that have been collected under different study designs. We review non-parametric survival analysis for data obtained by combining the most common types of cohort. We have two main goals: (i) to clarify the differences in the model assumptions and (ii) to provide a single lens through which some of the proposed estimators may be viewed. Our discussion is relevant to the meta-analysis of survival data obtained from different types of study, and to the modern era of electronic health records.
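As a rough illustration of why the sampling design matters, the sketch below computes a product-limit estimate in which each subject counts as at risk only between a delayed entry time and the observed failure or censoring time. This is the standard risk-set adjustment for left-truncated, right-censored data, not the combined-cohort estimators reviewed in the article. The function name `product_limit`, its arguments, and the toy data are all hypothetical, and the code assumes every entry time is strictly earlier than the corresponding observed time.

```python
def product_limit(entry, time, event):
    """Product-limit estimate of S(t) with delayed entry and right censoring.

    entry[i]: time at which subject i came under observation
              (0 for a subject followed from onset; assumed < time[i])
    time[i]:  observed failure or censoring time
    event[i]: True if time[i] is a failure, False if it is censored
    Returns a list of (failure_time, estimated_survival) pairs.
    """
    # Distinct observed failure times, in increasing order.
    failure_times = sorted({t for t, d in zip(time, event) if d})
    surv = 1.0
    curve = []
    for t in failure_times:
        # A subject is at risk at t only if already under observation
        # (entry < t) and not yet failed or censored (t <= time).
        at_risk = sum(1 for a, x in zip(entry, time) if a < t <= x)
        deaths = sum(1 for x, d in zip(time, event) if d and x == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy data: three subjects followed from time 0 and two
# subjects who entered observation late (at times 5 and 6).
entry = [0, 0, 0, 5, 6]
time = [4, 6, 5, 7, 8]
event = [True, False, True, True, True]

adjusted = product_limit(entry, time, event)
# Ignoring the delayed entry (treating everyone as observed from 0)
# enlarges the early risk sets and changes the estimate.
naive = product_limit([0] * len(time), time, event)
```

On this toy data the two curves differ at the early failure times, because the naive version counts the late entrants as at risk before they were actually under observation.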

Original language: English (US)
Pages (from-to): 72-87
Number of pages: 16
Journal: International Statistical Review
Issue number: 1
State: Published - Apr 2023

Bibliographical note

Funding Information:
The authors thank the reviewers for their careful reading of the manuscript, which we believe has led to an improved article. James H. McVittie was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) PGSD‐3 award. The views presented in this article are those of the authors and should not be viewed as official opinions or positions of the National Cancer Institute, National Institutes of Health or US Department of Health and Human Services. Shahinaz M. Gadalla is supported by the intramural research programme of the National Cancer Institute, NIH. The myotonic dystrophy data are from the CPRD database October 2016 release, obtained from the UK Medicines and Healthcare Products Regulatory Agency, HES database (© 2016) and ONS database (© 2016) reused with the permission of the Health and Social Care Information Centre. All rights reserved. The interpretation and conclusions contained in this study are those of the authors alone.

Publisher Copyright:
© 2022 International Statistical Institute. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.


Keywords

  • Censoring
  • EM algorithm
  • incident cohort
  • length bias
  • prevalent cohort

PubMed: MeSH publication types

  • Journal Article

