Generating pregnant patient biological profiles by deconvoluting clinical records with electronic health record foundation models

David Seong, Samson Mataraso, Camilo Espinosa, Eloise Berson, S. Momsen Reincke, Lei Xue, Chloe Kashiwagi, Yeasul Kim, Chi Hung Shu, Philip Chung, Marc Ghanem, Feng Xie, Ronald J. Wong, Martin S. Angst, Brice Gaudilliere, Gary M. Shaw, David K. Stevenson, Nima Aghaeepour

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Translational biology posits a strong bi-directional link between clinical phenotypes and a patient's biological profile. By leveraging this bi-directional link, we can efficiently deconvolute pre-existing clinical information into biological profiles. However, traditional computational tools are limited in their ability to resolve this link because of the relatively small sizes of paired clinical-biological datasets for training and the high dimensionality/sparsity of tabular clinical data. Here, we use state-of-the-art foundation models (FMs) for electronic health record (EHR) data to generate proteomics profiles of pregnant patients, thereby deconvoluting pre-existing clinical information into biological profiles without the cost and effort of running large-scale traditional omics studies. We show that FM-derived representations of a patient's EHR data coupled with a fully connected neural network prediction head can generate 206 blood protein expression levels. Interestingly, these proteins were enriched for developmental pathways, while proteins not able to be generated from EHR data were enriched for metabolic pathways. Finally, we show a proteomic signature of gestational diabetes that includes proteins with established and novel links to gestational diabetes. These results showcase the power of FM-derived EHR representations in efficiently generating biological states of pregnant patients. This capability can revolutionize disease understanding and therapeutic development, offering a cost-effective, time-efficient, and less invasive alternative to traditional methods of generating proteomics.
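The architecture named in the abstract, an FM-derived patient representation feeding a fully connected prediction head that outputs 206 protein levels, can be pictured with a minimal sketch. This is not the authors' implementation. The embedding width, hidden width, activation, initialisation, and the function names `init_head` and `predict_proteins` are assumptions made for illustration; only the output size of 206 comes from the abstract. Random vectors stand in for real FM embeddings, and the head is untrained.

```python
import numpy as np

# All sizes except N_PROTEINS are illustrative assumptions, not values from the paper.
EMBED_DIM = 768    # hypothetical width of the FM-derived patient representation
HIDDEN_DIM = 256   # hypothetical hidden-layer width
N_PROTEINS = 206   # number of blood proteins reported in the abstract


def init_head(rng, embed_dim=EMBED_DIM, hidden_dim=HIDDEN_DIM, n_out=N_PROTEINS):
    """Randomly initialise a two-layer fully connected regression head (hypothetical)."""
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / embed_dim), size=(embed_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, np.sqrt(2.0 / hidden_dim), size=(hidden_dim, n_out)),
        "b2": np.zeros(n_out),
    }


def predict_proteins(params, embeddings):
    """Map embeddings (n_patients, embed_dim) to protein levels (n_patients, n_out)."""
    hidden = np.maximum(embeddings @ params["W1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["W2"] + params["b2"]  # linear output for regression


rng = np.random.default_rng(0)
params = init_head(rng)
fake_embeddings = rng.normal(size=(4, EMBED_DIM))  # stand-in for FM outputs
predicted = predict_proteins(params, fake_embeddings)
print(predicted.shape)  # (4, 206)
```

In a setup like this, the head would typically be fitted on the paired clinical-proteomic cohort while the foundation model supplies the representations; whether the authors froze or fine-tuned the FM is not stated in the abstract.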

Original language: English (US)
Article number: bbae574
Journal: Briefings in Bioinformatics
Volume: 25
Issue number: 6
State: Published - Nov 1 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2024 The Author(s). Published by Oxford University Press.

Keywords

  • electronic health record
  • foundation model
  • machine learning
  • pregnancy
  • proteomics

PubMed: MeSH publication types

  • Journal Article
