An intrusive method for estimating speech intelligibility from noisy and distorted signals

Nursadul Mamun, Muhammad S.A. Zilany, John H.L. Hansen, Evelyn E. Davies-Venn

Research output: Contribution to journal › Article › peer-review

Abstract

An objective metric that predicts speech intelligibility under different types of noise and distortion would be desirable in voice communication. To date, most studies of speech intelligibility metrics have focused on predicting the effects of individual noise or distortion mechanisms. This study proposes an objective metric, the spectrogram orthogonal polynomial measure (SOPM), that attempts to predict speech intelligibility for people with normal hearing under adverse conditions. The SOPM metric is developed by extracting features from the spectrogram using Krawtchouk moments. The metric's performance is evaluated for several types of noise (steady-state and fluctuating noise), distortions (peak clipping, center clipping, and phase jitter), ideal time-frequency segregation, and reverberation, in both quiet and noisy environments. Across these conditions, the proposed metric correlates highly (0.97-0.996) with subjective scores from normal-hearing listeners.
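The abstract does not say how the Krawtchouk moments are computed or compared, so the following is only a minimal sketch of the general technique, not the authors' implementation. It builds an orthonormal set of weighted Krawtchouk polynomials and projects a small time-frequency block onto them. The function names, the 8x8 block size, the parameter p = 0.5, and the random stand-in for a spectrogram block are all assumptions made for illustration.

```python
import numpy as np
from math import comb

def krawtchouk_basis(size, p=0.5):
    """Rows are weighted Krawtchouk polynomials of order n = 0..size-1,
    sampled at x = 0..size-1 (so N = size - 1). Illustrative helper."""
    N = size - 1
    x = np.arange(size, dtype=float)
    # Binomial weight w(x) = C(N, x) p^x (1-p)^(N-x)
    w = np.array([comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(size)])
    basis = np.zeros((size, size))
    for n in range(size):
        # K_n(x) = sum_{k=0}^{n} (-n)_k (-x)_k / ((-N)_k k!) * (1/p)^k
        term = np.ones(size)
        poly = np.ones(size)
        for k in range(n):
            term = term * (-n + k) * (-x + k) / ((-N + k) * (k + 1) * p)
            poly = poly + term
        rho = ((1 - p) / p) ** n / comb(N, n)  # squared norm of K_n under w
        basis[n] = poly * np.sqrt(w / rho)
    return basis

def krawtchouk_moments(block, p_rows=0.5, p_cols=0.5):
    """Project a 2-D block (frequency x time) onto the weighted basis."""
    k_rows = krawtchouk_basis(block.shape[0], p_rows)
    k_cols = krawtchouk_basis(block.shape[1], p_cols)
    return k_rows @ block @ k_cols.T

# Random stand-in for an 8x8 spectrogram block (assumption, not real speech).
rng = np.random.default_rng(0)
block = rng.random((8, 8))
moments = krawtchouk_moments(block)
```

Because the weighted basis is orthonormal, the full set of moments reconstructs the block exactly; an intrusive metric would typically compare such moments between the clean reference and the degraded signal. The direct hypergeometric sum used here is adequate for small blocks but loses precision for large N, where a recurrence-based computation is usually preferred.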

Original language: English (US)
Pages (from-to): 1762-1778
Number of pages: 17
Journal: Journal of the Acoustical Society of America
Volume: 150
Issue number: 3
State: Published - Sep 1 2021

Bibliographical note

Funding Information:
This research was supported by Grant No. FRA-470202-25145 (M.S.A.Z.) and a Single Semester Leave (SSL) and Grant-in-Aid of Research, Artistry, and Scholarship from the University of Minnesota (E.E.D.-V.). This work was also supported by a National Institute on Deafness and Other Communication Disorders (NIDCD) Grant No. R01 DC016839-02. We would also like to acknowledge the insightful comments and suggestions from two anonymous reviewers in the preparation of this manuscript.

Publisher Copyright:
© 2021 Acoustical Society of America.

PubMed: MeSH publication types

  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
