
Speech feature estimation under the presence of noise with switching Kalman Filter methods


Publisher

University of Ottawa (Canada)

Abstract

The degradation of speech recognizer performance in the presence of additive noise remains one of the major unsolved problems in applying speech recognition technology. This thesis develops speech enhancement schemes in which the noisy speech signal is processed at the feature extraction stage. Because Mel-Frequency Cepstral Coefficients (MFCCs) are the most popular features for speech recognition, the schemes enhance vectors of Mel-scaled log-spectrum coefficients or cepstrum coefficients. Three different speech feature enhancement schemes based on switching linear dynamic models (SLDMs) are proposed. An SLDM describes the nonlinear and non-stationary time trajectory of speech features by switching among a set of linear dynamic models over time. With the resulting SLDMs as a speech model, together with a model for noise, speech and noise can be tracked jointly by means of switching Kalman filtering, which involves a weighted sum of filters operating interactively in parallel. Because the distortion caused by additive ambient noise is highly nonlinear in the feature domain, the Extended Kalman Filter (EKF) and the Unscented Kalman Filter (UKF) are used to handle it. Comprehensive experiments have been carried out to evaluate the proposed schemes on commonly used databases. The simulation results are compared, in terms of speech recognition accuracy, with those of other model-based feature enhancement systems in the literature. Compared with the best results on the Aurora2 database that we could find in the literature, our approach offers similar performance (85.96% vs. 86.72%) with lower complexity when the EKF is used. When the UKF is used, our approach achieves better performance (89.60% vs. 86.72%).
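
The abstract's point about nonlinear distortion in the feature domain can be illustrated with a small sketch. It is not the thesis's implementation. It assumes the commonly used log-spectral approximation y = log(exp(x) + exp(n)) for clean speech x and noise n (the abstract does not state its observation model), and it compares how an EKF-style first-order evaluation and a UKF-style unscented transform predict the mean of the noisy feature. The function names, the diagonal covariance, and the `kappa` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def noisy_log_spectrum(x, n):
    """Assumed mismatch function: y = log(exp(x) + exp(n)), phase term ignored."""
    return np.logaddexp(x, n)


def ekf_predicted_mean(mean_x, mean_n):
    """First-order (EKF-style) prediction: evaluate the nonlinearity at the mean."""
    return float(noisy_log_spectrum(mean_x, mean_n))


def ukf_predicted_mean(mean_x, mean_n, var_x, var_n, kappa=1.0):
    """Unscented-transform prediction for the 2-D state [x, n].

    Assumes independent x and n (diagonal covariance), so the sigma
    points lie along the coordinate axes.
    """
    dim = 2
    mean = np.array([mean_x, mean_n], dtype=float)
    spread = np.sqrt((dim + kappa) * np.array([var_x, var_n], dtype=float))

    # Centre point plus one symmetric pair per state dimension.
    points = [mean]
    weights = [kappa / (dim + kappa)]
    for i in range(dim):
        offset = np.zeros(dim)
        offset[i] = spread[i]
        points.extend([mean + offset, mean - offset])
        weights.extend([0.5 / (dim + kappa)] * 2)

    outputs = np.array([noisy_log_spectrum(p[0], p[1]) for p in points])
    return float(np.dot(weights, outputs))


if __name__ == "__main__":
    # Speech and noise at equal log-energy, both with unit variance.
    print("EKF-style mean:", ekf_predicted_mean(0.0, 0.0))
    print("UKF-style mean:", ukf_predicted_mean(0.0, 0.0, 1.0, 1.0))
```

Under this assumed model the function is convex, so the unscented prediction sits above the first-order one whenever the state is uncertain. That gap is the kind of error a purely linearized filter ignores, which is consistent with the UKF variant being the more accurate but costlier option.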

Citation

Source: Dissertation Abstracts International, Volume: 70-07, Section: B, page: 4375.
