Perplexity and proximity: Large language model perplexity complements semantic distance metrics for the detection of incoherent speech

  • Weizhe Xu
  • Serguei Pakhomov
  • Patrick Heagerty
  • Eric Horvitz
  • Ellen R. Bradley
  • Josh Woolley
  • Andrew Campbell
  • Alex Cohen
  • Dror Ben-Zeev
  • Trevor Cohen

Research output: Contribution to journal › Article › peer-review

Abstract

Objective: Semantic coherence in speech is characterized by a logical, connected flow of ideas. A lack of coherence in speech may reflect disorganized thinking, a core feature of psychosis in schizophrenia spectrum disorders (SSDs). Developing tools for automated assessment of semantic coherence in language could facilitate early detection of SSDs and improved monitoring of symptoms, enabling more timely intervention. Large language models (LLMs) have demonstrated strong capabilities on numerous language-centric tasks and have shown promise for analyzing semantic coherence, given the natural fit between their innate measures of language perplexity and the surprising turns that incoherent narrative often takes. This study aims to develop a novel representation and associated measure of semantic coherence using LLM-based perplexity metrics, and to compare this measure with traditional vector distance-based coherence metrics.

Method: We evaluated "bag" and "chain" models based on LLM perplexities as measures of semantic coherence. Regression models were trained using both single and paired combinations of perplexity- and proximity-based features to predict human ratings of semantic coherence on standardized instruments. Performance was evaluated on held-out examples from a training set of speeches from individuals experiencing psychotic symptoms and a test set of clinical interviews with patients diagnosed with SSDs, both labeled with human assessments of disorganized thinking severity.

Results: The best performance was achieved using a combination of perplexity and proximity features, yielding a Spearman correlation with human ratings of 0.61 (vs. 0.56 with proximity features alone) on leave-one-out cross-validation in the training set, and 0.54 (vs. 0.52 with proximity features alone) on the test set.

Conclusion: We developed novel methods for assessing semantic coherence using LLM perplexities and found them complementary to proximity-based methods. Combined, these methods showed improved performance across two datasets, highlighting LLMs' potential to enhance automated diagnosis and monitoring of SSDs.
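The two feature families described in the abstract can be illustrated in a minimal sketch. This is not the authors' implementation: it assumes token log-probabilities have already been obtained from some causal LM, and sentence embeddings from some encoder; the function names and the simple "chain" (consecutive-sentence similarity) formulation are illustrative assumptions.

```python
import math

def perplexity(token_logprobs):
    """Perplexity-based feature: exp of the negative mean per-token
    log-probability. Assumes log-probs were produced by a causal LM
    (not computed here). Higher values indicate more 'surprising' text."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def chain_proximity(sentence_embeddings):
    """Proximity-based feature (illustrative 'chain' variant): mean cosine
    similarity between consecutive sentence embeddings. Lower values
    suggest larger semantic jumps between adjacent sentences."""
    pairs = zip(sentence_embeddings, sentence_embeddings[1:])
    sims = [cosine(a, b) for a, b in pairs]
    return sum(sims) / len(sims)

# Toy example: four tokens each with probability 0.5 gives perplexity 2.0.
ppl = perplexity([math.log(0.5)] * 4)

# Toy 2-d "embeddings": the second transition is a sharp semantic turn.
prox = chain_proximity([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

In the study, features of this kind were combined in a regression model to predict human coherence ratings; the combination of both feature types outperformed either alone.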

Original language: English (US)
Article number: 104899
Journal: Journal of Biomedical Informatics
Volume: 170
DOIs
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 The Authors

PubMed: MeSH publication types

  • Journal Article
