Abstract:
This thesis develops a new method for extracting pitch from observations of
speech heavily degraded by white noise. The
extraction of pitch at a very low SNR is still a challenging problem for researchers
and no effective method has yet been reported. To estimate pitch, all the existing
autocorrelation-based techniques invariably use the location of the second largest
peak, at the speech signal periodicity, relative to the strongest peak at the
origin (zero lag) of the autocorrelation function. At a very low SNR, these
approaches are likely to give gross pitch errors due to the presence of spurious
peaks obscuring the pitch peak. Thus, the conventional methods fail to estimate
pitch with acceptable accuracy below a certain positive value of SNR.
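
As a point of reference, the conventional peak-picking idea described above can
be sketched as follows. This is a minimal, generic baseline and not a specific
method evaluated in this thesis; the function names, the search range
(fmin, fmax), the frame length, and the synthetic test signal are all
illustrative assumptions.

```python
import numpy as np

def acf_pitch(frame, fs, fmin=50.0, fmax=500.0):
    """Estimate pitch (Hz) from the largest autocorrelation peak away from lag 0.

    fmin and fmax bound the assumed pitch range; they are illustrative values.
    """
    x = frame - np.mean(frame)
    # Autocorrelation for lags 0..N-1 (raw sums, i.e. the biased estimate times N).
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(r) - 1)
    # The "second largest peak": the maximum within the admissible lag range.
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
    return fs / lag

def add_white_noise(x, snr_db, rng):
    """Add white Gaussian noise so that the result has the requested SNR in dB."""
    noise_power = np.mean(x ** 2) / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)

# Synthetic voiced frame: 200 Hz pitch, four harmonics, 40 ms at 8 kHz (assumed values).
fs = 8000
t = np.arange(int(0.04 * fs)) / fs
clean = sum(np.cos(2 * np.pi * k * 200 * t) / k for k in range(1, 5))

clean_estimate = acf_pitch(clean, fs)
noisy_estimate = acf_pitch(add_white_noise(clean, -5.0, np.random.default_rng(0)), fs)
```

On the clean frame the search recovers the 200 Hz pitch. On the noisy frame the
maximum may fall on a spurious peak, which is the failure mode that motivates
the method proposed below.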
In this research work, unlike conventional approaches, we propose a noise-robust
method for pitch estimation using the dominant harmonic (DH) in the
harmonic sinusoidal speech model. To estimate the DH accurately under strongly
noisy conditions, a cosine autocorrelation model (CAM) of the noise-free speech
signal is proposed. Moreover, a sinusoidal autocorrelation model (SAM) of the
clean speech using an approximate all-pole speech model is derived for more accurate
estimation of the DH. The harmonic number of the DH required to estimate
pitch is then determined by function optimization. The optimization uses the inner
product of a variable-period, short-length impulse train with the half-wave rectified
weighted autocorrelation function of the pre-filtered noisy speech (PFNS)
signal. As the DH frequency is an integer multiple of the pitch frequency, the period
of the impulse train is tuned as a function of the DH. Numerical analysis reveals that the traditional
idea of "reliable pitch estimation depending solely on the correct determination
of the second largest peak of the autocorrelation function" can be relaxed. In this
work, pitch estimation accuracy depends strongly on the accuracy of DH
estimation, a task that appears to be easier than the peak picking required by
the conventional methods. Simulation results show that the proposed method
estimates pitch with higher accuracy than other recently reported methods, even
at an SNR as low as -5 dB.
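
The harmonic-number search described above can be illustrated with a minimal
sketch. It is written under several assumptions and is not the thesis
implementation: the DH frequency f_dh is taken as already known (the CAM/SAM
estimation step is not shown), pre-filtering is omitted, the linear taper of
the biased autocorrelation estimate stands in for the weighting (whose exact
form is not given here), and the names harmonic_number, max_m, and n_impulses
are hypothetical.

```python
import numpy as np

def harmonic_number(frame, fs, f_dh, max_m=6, n_impulses=3):
    """Pick the integer m for which f_dh / m best matches the frame periodicity.

    Assumes a non-silent frame and an already estimated DH frequency f_dh (Hz).
    """
    x = frame - np.mean(frame)
    # Biased autocorrelation (raw sums); its taper favors shorter lags.
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Normalize by the zero-lag value, then half-wave rectify.
    r = np.maximum(r / r[0], 0.0)
    scores = []
    for m in range(1, max_m + 1):
        period = fs * m / f_dh  # candidate pitch period in samples
        lags = np.rint(period * np.arange(1, n_impulses + 1)).astype(int)
        lags = lags[lags < len(r)]
        # Inner product of a unit impulse train with the rectified autocorrelation.
        scores.append(r[lags].sum())
    return 1 + int(np.argmax(scores))

# Synthetic frame: 200 Hz pitch whose third harmonic (600 Hz) is dominant (assumed values).
fs = 8000
t = np.arange(320) / fs
amps = [0.6, 0.4, 1.0, 0.3]
frame = sum(a * np.cos(2 * np.pi * (k + 1) * 200 * t) for k, a in enumerate(amps))

m = harmonic_number(frame, fs, f_dh=600.0)
pitch = 600.0 / m
```

For this clean synthetic frame the search selects m = 3, so the pitch estimate
is 600 / 3 = 200 Hz. The sketch only shows how the impulse-train period is tied
to the DH; behavior under heavy noise depends on the pre-filtering and
weighting that it leaves out.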