Abstract
In this work, we consider enhancing a target speech from a singlechannel noisy observation corrupted by non-stationary noises at low signal-to-noise ratios (SNRs). We take a classification-based approach, where the objective is to estimate an Ideal Binary Mask (IBM) that classifies each time-frequency (T-F) unit of the noisy observation into one of the two categories: speech-dominant unit or noise-dominant unit. The estimated mask is used to binary weight the noisy mixture to obtain the enhanced speech. In the proposed system, the sparse non-negative matrix factorization (NMF) is used to extract features from the noisy observation, followed by a Deep Neural Network (DNN) for classification. Compared with several existing classification-based systems, the proposed system uses minimal speech-specific domain knowledge, but is able to achieve better performance in certain low SNR regions. Moreover, the proposed system outperforms the traditional statistical method, especially in terms of improving the intelligibility.
Original language | English (US) |
---|---|
Title of host publication | 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 2145-2149 |
Number of pages | 5 |
ISBN (Electronic) | 9781467369978 |
DOIs | |
State | Published - Aug 4 2015 |
Event | 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia Duration: Apr 19 2014 → Apr 24 2014 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2015-August |
ISSN (Print) | 1520-6149 |
Other
Other | 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 |
---|---|
Country/Territory | Australia |
City | Brisbane |
Period | 4/19/14 → 4/24/14 |
Bibliographical note
Publisher Copyright:© 2015 IEEE.
Keywords
- Speech enhancement
- deep neural network (DNN)
- non-negative matrix factorization (NMF)
- sparse coding