Pitch extraction of noisy speech using dominant frequency of the harmonic speech model

dc.contributor.advisor	Kamrul Hasan, Dr. Md.
dc.contributor.author	Celia Shahnaz
dc.date.accessioned	2015-10-14T06:25:47Z
dc.date.available	2015-10-14T06:25:47Z
dc.date.issued	2002-12
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/1010
dc.description.abstract	This thesis develops a new method for extracting pitch from speech heavily degraded by white noise. Pitch extraction at very low SNR remains a challenging problem, and no effective method has yet been reported. To estimate pitch, existing autocorrelation-based techniques invariably use the location of the second largest peak, which occurs at the periodicity of the speech signal, relative to the strongest peak at the true origin of the autocorrelation function. At very low SNR, these approaches are likely to produce gross pitch errors because spurious peaks obscure the pitch peak. Conventional methods therefore fail to estimate pitch with acceptable accuracy below a certain positive SNR. In this work, unlike conventional approaches, we propose a noise-robust method for pitch estimation that uses the dominant harmonic (DH) of the harmonic sinusoidal speech model. To estimate the DH accurately under strong noise, a cosine autocorrelation model (CAM) of the noise-free speech signal is proposed. Moreover, a sinusoidal autocorrelation model (SAM) of the clean speech is derived from an approximate all-pole speech model for more accurate estimation of the DH. The harmonic number of the DH, which is required to estimate pitch, is then determined by function optimization. The objective is the inner product of a variable-period, short-length impulse train with the half-wave rectified weighted autocorrelation function of the pre-filtered noisy speech (PFNS) signal. Because the DH is an integer multiple of the pitch, the period of the impulse train is tuned as a function of the DH. Numerical analysis reveals that the traditional idea of "reliable pitch estimation depending solely on the correct determination of the second largest peak of the autocorrelation function" can be relaxed.
In this work, pitch estimation accuracy depends strongly on the accuracy of the DH estimate, a task that appears easier than the one faced by conventional methods. The simulation results show that the proposed method estimates pitch with higher accuracy than other recently reported methods, even at an SNR as low as -5 dB.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Electrical and Electronic Engineering	en_US
dc.subject	Pitch extraction of noisy speech	en_US
dc.subject	Frequency of the harmonic speech model	en_US
dc.title	Pitch extraction of noisy speech using dominant frequency of the harmonic speech model	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	100006201 P	en_US
dc.identifier.accessionNumber	97098
dc.contributor.callno	623.8043/CEL/2002	en_US
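
The harmonic-number search described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the parameters `max_harmonic` and `n_impulses`, and the FFT-peak DH estimator are all assumptions made for illustration. The CAM/SAM models and the pre-filter are not reproduced, and the unspecified autocorrelation weighting is replaced here by the taper of the biased autocorrelation estimate.

```python
import numpy as np


def autocorr_biased(x):
    """Biased autocorrelation estimate for lags 0..len(x)-1 (mean removed)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.correlate(x, x, mode="full")[n - 1:] / n


def dominant_frequency_fft(x, fs, n_fft=4096):
    """Stand-in DH estimator (assumption): largest peak of the windowed,
    zero-padded magnitude spectrum. The thesis uses CAM/SAM instead."""
    x = np.asarray(x, dtype=float)
    windowed = (x - x.mean()) * np.hanning(len(x))
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    return np.argmax(spectrum) * fs / n_fft


def pitch_from_dominant_harmonic(x, fs, f_dh, max_harmonic=6, n_impulses=3):
    """Pick the harmonic number m of the DH and return (f_dh / m, m).

    For each candidate m, the candidate pitch period is m * fs / f_dh samples.
    The score is the inner product of a short impulse train at multiples of
    that period with the half-wave rectified autocorrelation.
    """
    r = np.maximum(autocorr_biased(x), 0.0)  # half-wave rectification
    best_m, best_score = 1, -np.inf
    for m in range(1, max_harmonic + 1):
        period = m * fs / f_dh  # candidate pitch period in samples
        lags = np.rint(period * np.arange(1, n_impulses + 1)).astype(int)
        lags = lags[lags < len(r)]  # drop impulses beyond the frame
        score = r[lags].sum()  # inner product with a unit impulse train
        if score > best_score:
            best_m, best_score = m, score
    return f_dh / best_m, best_m
```

A typical call, under these assumptions, would be `f_dh = dominant_frequency_fft(frame, fs)` followed by `f0, m = pitch_from_dominant_harmonic(frame, fs, f_dh)`. The frame should span several of the longest candidate periods, since impulses that fall beyond the frame are dropped and lower the score of large harmonic numbers.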