Abstract:
Spectral-subtraction-based algorithms are commonly used for single-channel
speech enhancement because they denoise effectively at a low computational
load. They suffer, however, from a serious drawback: the enhanced speech is
accompanied by an unpleasant musical-noise artifact, which is characterized
by tones at random frequencies. It is known that the key
point behind the reduction of musical noise by the minimum-mean-squared-error
(MMSE) estimator is its use of the a priori SNR. The "decision-directed" approach
widely used to estimate the a priori SNR requires an averaging (smoothing)
parameter, which most researchers set to a constant value. The main objective of this work
is to develop a self-adaptive smoothing parameter, in the MMSE sense, for
estimating the a priori SNR in the DCT domain, so that the estimate can track
abrupt changes in the speech spectral amplitudes. Using the proposed
self-adaptive smoothing parameter in commonly used spectral subtraction
algorithms yields a noteworthy performance improvement in denoising speech
corrupted by background noise.
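
As a minimal sketch of the conventional constant-parameter baseline, the
decision-directed estimate of the a priori SNR for real-valued (e.g. DCT)
coefficients can be written as follows. The function name, the Wiener gain used
to form the previous-frame amplitude estimate, the first-frame initialization,
and the floor xi_min are assumptions made for illustration only; the
self-adaptive smoothing parameter proposed in this work is not reproduced here.

```python
def decision_directed_snr(noisy_frames, noise_var, alpha=0.98, xi_min=1e-3):
    """Constant-alpha decision-directed a priori SNR estimate (illustrative).

    noisy_frames: list of frames, each a list of real transform coefficients.
    noise_var:    list with the noise variance of each coefficient index.
    Returns (xi, enhanced), two lists of frames with the same shape as the input.
    """
    n_bins = len(noise_var)
    prev_clean_power = [0.0] * n_bins  # squared amplitude estimate of previous frame
    xi_all, enhanced = [], []
    for m, frame in enumerate(noisy_frames):
        xi_frame, out_frame = [], []
        for k in range(n_bins):
            gamma = frame[k] ** 2 / noise_var[k]   # a posteriori SNR
            ml_term = max(gamma - 1.0, 0.0)        # instantaneous SNR estimate
            if m == 0:
                xi = ml_term                       # assumed initialization
            else:
                xi = (alpha * prev_clean_power[k] / noise_var[k]
                      + (1.0 - alpha) * ml_term)
            xi = max(xi, xi_min)                   # assumed floor on the estimate
            gain = xi / (1.0 + xi)                 # Wiener gain (assumed here)
            s_hat = gain * frame[k]
            prev_clean_power[k] = s_hat ** 2
            xi_frame.append(xi)
            out_frame.append(s_hat)
        xi_all.append(xi_frame)
        enhanced.append(out_frame)
    return xi_all, enhanced
```

With alpha close to 1 the estimate is heavily smoothed across frames, which
suppresses musical noise but reacts slowly to speech onsets; this trade-off is
what motivates replacing the constant by a self-adaptive value.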
Compared with the spectral-subtraction-based methods, conventional Wiener
filtering shows better denoising performance in terms of overall and average
segmental SNR, but at the cost of a worse Itakura-Saito (IS) measure. In this work, a
generalized Wiener filter is proposed that introduces a new term into the gain
function, improving the IS measure without sacrificing the SNR of the enhanced
speech. A comparative study against the spectral subtraction algorithms and the
conventional Wiener filter confirms the superiority of the proposed generalized
Wiener filter.
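
For reference, the two SNR measures named above can be sketched as follows.
This is an illustrative computation only: the function names, the frame length
of 256 samples, and the clamping of per-frame values to the range -10 dB to
35 dB are assumptions (a common convention), not settings taken from this work.

```python
import math


def overall_snr_db(clean, enhanced):
    """Overall SNR in dB between a clean signal and its enhanced version."""
    signal = sum(s * s for s in clean)
    error = sum((s - e) ** 2 for s, e in zip(clean, enhanced))
    if error == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / error)


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Average segmental SNR in dB over non-overlapping frames.

    Frames with zero signal energy or zero error are skipped, and each
    per-frame SNR is clamped to [lo, hi] before averaging (assumed convention).
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(x * x for x in c)
        error = sum((x - y) ** 2 for x, y in zip(c, e))
        if signal == 0.0 or error == 0.0:
            continue
        snr = 10.0 * math.log10(signal / error)
        values.append(min(max(snr, lo), hi))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)
```

The overall SNR is dominated by high-energy frames, whereas the segmental
average weights all frames equally, which is why both are usually reported.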