Combining sparse NMF with deep neural network: A new classification-based approach for speech enhancement

Hung Wei Tseng, Mingyi Hong, Zhi Quan Luo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Scopus citations

Abstract

In this work, we consider enhancing target speech from a single-channel noisy observation corrupted by non-stationary noise at low signal-to-noise ratios (SNRs). We take a classification-based approach, where the objective is to estimate an Ideal Binary Mask (IBM) that classifies each time-frequency (T-F) unit of the noisy observation into one of two categories: speech-dominant or noise-dominant. The estimated mask is then applied as a binary weight to the noisy mixture to obtain the enhanced speech. In the proposed system, sparse non-negative matrix factorization (NMF) is used to extract features from the noisy observation, followed by a Deep Neural Network (DNN) for classification. Compared with several existing classification-based systems, the proposed system uses minimal speech-specific domain knowledge, yet achieves better performance in certain low-SNR regions. Moreover, the proposed system outperforms the traditional statistical method, especially in improving intelligibility.
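The binary masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: in the paper the mask is estimated by a DNN from sparse NMF features, whereas the code below computes the ideal mask directly from known speech and noise magnitudes, which is only possible when both are available separately (for example, when building training targets). The function names, the 0 dB local criterion `lc_db`, and the toy magnitudes are assumptions made for illustration.

```python
import numpy as np

def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0):
    """Label each T-F unit 1 (speech-dominant) or 0 (noise-dominant).

    lc_db is a hypothetical local SNR threshold in dB (assumed, not from the paper).
    """
    eps = 1e-12  # guards against division by zero and log of zero
    local_snr_db = 20.0 * np.log10((speech_mag + eps) / (noise_mag + eps))
    return (local_snr_db > lc_db).astype(np.float64)

def apply_mask(noisy_mag, mask):
    """Binary-weight the noisy mixture: keep speech-dominant units, zero the rest."""
    return noisy_mag * mask

# Toy magnitudes: 2 frequency bins x 3 time frames (made-up values).
speech = np.array([[1.0, 0.1, 2.0],
                   [0.5, 0.5, 0.0]])
noise = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.25, 1.0]])

mask = ideal_binary_mask(speech, noise)
# Summing magnitudes is a simplification standing in for the true mixture spectrogram.
enhanced = apply_mask(speech + noise, mask)
```

At test time the clean speech and noise are unknown, so a classifier has to predict `mask` from features of the noisy observation alone; that prediction is the role the abstract assigns to the sparse NMF features and the DNN.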

Original language: English (US)
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2145-2149
Number of pages: 5
ISBN (Electronic): 9781467369978
State: Published - Aug 4 2015
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: Apr 19 2014 - Apr 24 2014

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2015-August
ISSN (Print): 1520-6149

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country/Territory: Australia
City: Brisbane
Period: 4/19/14 - 4/24/14

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Keywords

  • Speech enhancement
  • deep neural network (DNN)
  • non-negative matrix factorization (NMF)
  • sparse coding
