Abstract:
The performance of thresholding-based speech enhancement methods depends largely on the accuracy of the estimated threshold value and on the choice of the thresholding function. The main challenge in denoising approaches that threshold the wavelet coefficients of noisy speech is estimating a threshold value that separates the wavelet coefficients of noise from those of clean speech. Given such a threshold, designing a thresholding scheme that minimizes the effect of the wavelet coefficients corresponding to noise is a further difficulty, because conventional discrete wavelet transform (DWT) based denoising approaches perform satisfactorily only at a relatively high signal-to-noise ratio (SNR). To handle the practical conditions of real-life applications, a speech enhancement method must, besides being computationally simple, be capable of improving overall speech quality while minimizing the loss of speech intelligibility at low SNR. Thus, under severe noise conditions, developing a speech enhancement method that both determines an appropriate threshold value and applies an effective thresholding scheme remains an open problem. In this thesis, a speech enhancement method is presented in which a custom thresholding function is proposed and applied to the Wavelet Packet (WP) coefficients of the noisy speech. The thresholding function switches between modified hard and semisoft thresholding functions depending on a parameter that reflects the characteristics of the signal under consideration. The threshold is determined from statistical modeling of the Teager energy operated WP coefficients of the noisy speech.
Extensive simulations indicate that the threshold thus obtained, used in conjunction with the custom thresholding function, is very effective in reducing not only white noise but also colored noise in noisy speech, resulting in enhanced speech with better quality and intelligibility. Several standard objective measures and subjective evaluations, including informal listening tests, show that the proposed method outperforms recent state-of-the-art thresholding-based speech enhancement approaches from high to low levels of SNR.
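
As a rough illustration of the building blocks named above, the following sketch shows the discrete Teager energy operator and the textbook hard and semisoft (firm) thresholding rules applied to an array of coefficients. It is not the thesis's custom thresholding function: the "modified" hard rule, the statistical threshold estimate, and the switching parameter are not specified in this abstract, so the function names, the two thresholds `t1` and `t2`, and the boolean flag `use_semisoft` are all assumptions made only for illustration.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two endpoints have no two-sided neighbourhood and are left at zero
    (an illustrative choice, not taken from the thesis).
    """
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi


def hard_threshold(c, t):
    """Standard hard thresholding: keep coefficients with |c| > t, zero the rest."""
    c = np.asarray(c, dtype=float)
    return np.where(np.abs(c) > t, c, 0.0)


def semisoft_threshold(c, t1, t2):
    """Standard semisoft (firm) thresholding with 0 < t1 < t2.

    |c| <= t1      -> 0
    t1 < |c| <= t2 -> sign(c) * t2 * (|c| - t1) / (t2 - t1)
    |c| > t2       -> c (unchanged)
    """
    c = np.asarray(c, dtype=float)
    a = np.abs(c)
    mid = np.sign(c) * t2 * (a - t1) / (t2 - t1)
    return np.where(a <= t1, 0.0, np.where(a <= t2, mid, c))


def switching_threshold(c, t1, t2, use_semisoft):
    """Hypothetical switch between the two rules.

    The thesis selects the rule from a signal-dependent parameter; here a
    plain boolean stands in for that decision.
    """
    if use_semisoft:
        return semisoft_threshold(c, t1, t2)
    return hard_threshold(c, t1)


if __name__ == "__main__":
    coeffs = np.array([0.5, 1.5, 3.0, -1.5])
    print(teager_energy(coeffs))
    print(switching_threshold(coeffs, t1=1.0, t2=2.0, use_semisoft=True))
    print(switching_threshold(coeffs, t1=1.0, t2=2.0, use_semisoft=False))
```

In a pipeline of the kind described, `coeffs` would be the WP coefficients of one subband of a noisy speech frame, and the thresholds would come from the statistical model of their Teager energy rather than from fixed constants as in this sketch.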