In this work, we consider enhancing target speech from a single-channel noisy observation corrupted by non-stationary noise at low signal-to-noise ratios (SNRs). We take a classification-based approach, where the objective is to estimate an Ideal Binary Mask (IBM) that classifies each time-frequency (T-F) unit of the noisy observation into one of two categories: speech-dominant or noise-dominant. The estimated mask is then applied to the noisy mixture as a binary weight to obtain the enhanced speech. In the proposed system, sparse non-negative matrix factorization (NMF) is used to extract features from the noisy observation, followed by a Deep Neural Network (DNN) for classification. Compared with several existing classification-based systems, the proposed system uses minimal speech-specific domain knowledge, yet achieves better performance in certain low-SNR regions. Moreover, the proposed system outperforms the traditional statistical method, especially in terms of improving intelligibility.
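To make the masking step concrete, the following is a minimal sketch of how an oracle IBM can be computed and applied when the clean speech and noise magnitudes are known. It is not the proposed estimator, which predicts the mask from NMF features with a DNN. The function names, the 0 dB local criterion `lc_db`, and the random toy spectrograms are assumptions made for illustration only.

```python
import numpy as np


def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0, eps=1e-12):
    """Label each T-F unit 1 (speech-dominant) or 0 (noise-dominant).

    A unit is speech-dominant when its local SNR in dB exceeds the
    local criterion lc_db (0 dB here is an assumed, common choice).
    """
    local_snr_db = 20.0 * np.log10((speech_mag + eps) / (noise_mag + eps))
    return (local_snr_db > lc_db).astype(np.float64)


def apply_mask(mixture_tf, mask):
    """Binary-weight the noisy T-F representation with the mask."""
    return mixture_tf * mask


# Toy example: random magnitudes stand in for real spectrograms
# (rows are frequency bins, columns are time frames).
rng = np.random.default_rng(0)
speech_mag = rng.random((4, 5))
noise_mag = rng.random((4, 5))
mixture_tf = (speech_mag + noise_mag).astype(np.complex128)

mask = ideal_binary_mask(speech_mag, noise_mag)
enhanced_tf = apply_mask(mixture_tf, mask)
```

In a full system, `mask` would be replaced by the classifier's estimate, and `enhanced_tf` would be converted back to a waveform by inverting the T-F transform.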