Abstract:
The performance of detecting and classifying nasalized vowels in a mixture of
oral and nasalized vowels depends largely on the acoustic features with which
the task is performed. In this thesis, sets of acoustic features are derived from
both the magnitude and phase spectra of the vowels, and their performance is evaluated.
A detailed acoustic analysis of nasalized vowels shows that, although additional
formants are introduced at various frequency locations, a new resonance in the
low-frequency region around 250 Hz appears in the modified group delay function
and remains consistent for both male and female speakers. This fact is verified
on the band-limited modified group delay function, which is capable of resolving
the two closely spaced formants that occur in nasalized vowels, and is exploited
to derive an acoustic parameter, RMGD. Utilizing RMGD, the idea of detecting
nasalized vowels and determining the degree of nasality with respect to the nasal
consonants adjacent to the vowel is developed. It is argued that a vowel can be
nasalized with at least one adjacent nasal consonant, even if the nasal consonant is
pre-vocalic, and that the vowel is more nasalized in pre-nasal position than in
post-nasal position.
It is also found that a vowel with nasal consonants on both sides is not guaranteed
to be more nasalized than a vowel with one adjacent nasal consonant. Using the fact
that nasality changes with the number of adjacent nasal consonants, the detection
and classification of non-nasalized and contextually nasalized vowels is formulated
as a four-class problem and solved with a threshold-based or classifier-based scheme,
which is found to be superior to some of the existing methods in detecting and
classifying nasalized vowels. Mel-frequency cepstral coefficients are
widely used features for the detection task. Conventionally, features for detecting
and classifying nasalized vowels are derived from the magnitude spectrum only,
ignoring the phase spectrum. Exploiting the power spectrum and the group delay
function of a band-limited vowel, the product spectrum is defined, thus incorporating
the information of both the magnitude and phase spectra. The product spectrum is
then differentiated with respect to frequency to obtain the differential product
spectrum, which is argued to be more robust in the presence of noise. Relying on
the noise reduction capability of the autocorrelation sequence, the power and
product spectra of the band-limited autocorrelation sequence of the vowel are developed.
Simulation results show that Mel-frequency cepstral coefficients derived from the
product spectrum, the differential product spectrum, and the power and product
spectra of the autocorrelation sequence consistently outperform some of the
conventional approaches in detecting and classifying nasalized vowels in both clean
and different noisy conditions.
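
As an illustration of the product spectrum mentioned above, the following is a minimal sketch and not the implementation used in the thesis. It assumes the standard definition of the product spectrum as the power spectrum multiplied by the group delay function, which for a frame x[n] with DFT X and for Y, the DFT of n*x[n], reduces to X_R*Y_R + X_I*Y_I. The function names, the FFT size, and the use of a first difference across frequency bins for the differential product spectrum are assumptions made for this example. Band limiting, the modified group delay function, and the Mel-cepstral steps are left out.

```python
import numpy as np

def product_spectrum(frame, n_fft=512):
    """Power spectrum times group delay, computed without phase unwrapping.

    Assumes the standard identity Q = X_R*Y_R + X_I*Y_I, where X is the DFT
    of x[n] and Y is the DFT of n*x[n]. Names are illustrative only.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)       # spectrum of x[n]
    Y = np.fft.rfft(n * frame, n_fft)   # spectrum of n*x[n]
    return X.real * Y.real + X.imag * Y.imag

def differential_product_spectrum(frame, n_fft=512):
    """Unscaled first difference of the product spectrum across frequency bins.

    This is an assumed discrete stand-in for differentiation with respect
    to frequency; the thesis may define the operation differently.
    """
    return np.diff(product_spectrum(frame, n_fft))

# Sanity check: a unit impulse delayed by 5 samples has unit power spectrum
# and a constant group delay of 5 samples at every frequency bin.
x = np.zeros(64)
x[5] = 1.0
q = product_spectrum(x, n_fft=128)
dq = differential_product_spectrum(x, n_fft=128)
```

For the delayed impulse, every value of `q` equals the delay in samples and `dq` is zero, which gives a quick check that the magnitude and phase information are combined as intended.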