Formant estimation for noise robust vowel recognition based on spectral domain ramp cepstrum model

dc.contributor.advisor	Anowarul Fattah, Dr. Shaikh
dc.contributor.author	Goswami, Rajib
dc.date.accessioned	2016-06-29T04:48:44Z
dc.date.available	2016-06-29T04:48:44Z
dc.date.issued	2012-07
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/3397
dc.description.abstract	Formants are the distinguishing frequency components of human speech, which play an important role in characterizing different voiced sounds. Formant-based speech synthesis and coding are widely used in several real-life applications, such as voice-operated controls and telecommunication. In almost all practical applications, speech signals are affected by different kinds of background noise, and estimating formants under severe background noise is a difficult task. In this thesis, efficient formant estimation is investigated and methods for formant estimation are devised with a view to improving the estimation performance under severely noisy conditions. In order to extract the formant frequencies, a strongly voiced portion of the given speech utterance is first extracted based on an energy measure. Instead of considering the whole duration of a voiced sound at once, frame-by-frame analysis is performed. Within a frame of voiced speech signal, formants can be estimated using different time or frequency domain approaches. Correlation-based methods are the most common time domain approaches to estimating formants from speech signals. In linear predictive coding (LPC) based methods, Yule-Walker equations are constructed from the autocorrelation function (ACF) of the given speech utterance, and formants are obtained from their solutions. Spectral peak picking is another extremely popular method of formant estimation, where both parametric and non-parametric spectral estimation techniques are used. Recently, cepstrum domain methods have also been used in formant estimation. In the presence of heavy background noise, spurious peaks appear in the speech spectrum, making accurate formant estimation very difficult.
The estimation performance of both time and frequency domain methods deteriorates drastically under heavily noisy conditions. The main goal here is to develop a formant estimation scheme that provides satisfactory performance even at low levels of signal-to-noise ratio (SNR). In order to reduce the effect of noise, the strength of the dominant pole pairs in the spectrum of noisy speech needs to be enhanced. To achieve this objective, a spectral domain ramp cepstrum model of the autocorrelation function of the speech signal is developed. The model exploits the fact that the ACF offers better noise immunity than the noisy signal itself. Transforming from the time domain to the cepstral domain offers the advantage of homomorphic deconvolution, which can reduce the effect of pitch in speech analysis. In order to avoid the rapid cepstral decay, the ramp cepstrum is used instead of the cepstrum. Since the pole-preserving property of the ramp cepstrum (RC) is better exploited via spectral peaks, the spectrum of the RC of the ACF of speech is proposed as the desired model. In order to extract the formants from the observed noisy speech signal using the derived model, a model matching scheme is introduced. In the model matching technique, instead of relying on peak picking, the fitting error is minimized over a wider peak zone, resulting in more accurate formant frequency estimation. Finally, the estimated formants are used as potential features in a vowel recognition scheme. A linear discriminant based algorithm is used for recognition. Extensive experimentation is carried out on different male and female vowel utterances from a standard speech database under different noisy conditions. It is found that the proposed methods provide a high degree of formant estimation accuracy in comparison to that obtained by some state-of-the-art methods, especially at very low levels of SNR.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Electrical and Electronic Engineering (EEE)	en_US
dc.subject	Speech synthesis	en_US
dc.title	Formant estimation for noise robust vowel recognition based on spectral domain ramp cepstrum model	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	1009062055	en_US
dc.identifier.accessionNumber	111124
dc.contributor.callno	623.99/GOS/2012	en_US
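
The chain of transforms named in the abstract (ACF, then cepstrum, then ramp cepstrum, then its spectrum) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a noise-free synthetic frame (the impulse response of a single two-pole resonator), defines the ramp cepstrum as n·c(n), and uses plain peak picking in place of the model matching step described above. The function names, sampling rate, pole radius, and FFT length are hypothetical choices made only for this illustration.

```python
import numpy as np

def resonator_impulse_response(freq_hz, radius, fs, n_samples):
    """Impulse response of a two-pole resonator with poles at radius*exp(+/-j*theta)."""
    theta = 2.0 * np.pi * freq_hz / fs
    a1, a2 = 2.0 * radius * np.cos(theta), -radius ** 2
    h = np.zeros(n_samples)
    for n in range(n_samples):
        h[n] = 1.0 if n == 0 else 0.0      # unit impulse input
        if n >= 1:
            h[n] += a1 * h[n - 1]
        if n >= 2:
            h[n] += a2 * h[n - 2]
    return h

def ramp_cepstrum_spectrum(frame, nfft):
    """Magnitude spectrum of the ramp cepstrum of the frame's autocorrelation."""
    power = np.abs(np.fft.rfft(frame, nfft)) ** 2          # power spectrum of the frame
    acf = np.fft.irfft(power, nfft)                         # ACF via the Wiener-Khinchin relation
    log_spec = np.log(np.maximum(np.fft.rfft(acf).real, 1e-12))
    cep = np.fft.irfft(log_spec, nfft)                      # cepstrum of the ACF
    n = np.arange(nfft // 2)
    ramp_cep = n * cep[: nfft // 2]                         # ramp weighting cancels the 1/n decay
    return np.abs(np.fft.rfft(ramp_cep, nfft))

# Assumed illustration values, not taken from the thesis.
fs = 8000
nfft = 1024
true_formant_hz = 700.0

frame = resonator_impulse_response(true_formant_hz, 0.95, fs, 512)
spec = ramp_cepstrum_spectrum(frame, nfft)

# Simple peak picking; the thesis instead fits a model over a wider peak zone.
estimated_formant_hz = np.argmax(spec) * fs / nfft
print(f"estimated formant: {estimated_formant_hz:.1f} Hz")
```

For an all-pole signal the cepstrum decays like r^n / n, so multiplying by n leaves a sum of damped sinusoids at the pole angles; that is the pole-preserving property that makes the spectral peaks of the ramp cepstrum usable for formant estimation.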