Speech recognition in adverse environments: Improvements to IMELDA.

dc.contributor.advisor: Aboulnasr, Tyseer,
dc.contributor.author: Starks, David Ross.
dc.date.accessioned: 2009-03-25T19:44:47Z
dc.date.available: 2009-03-25T19:44:47Z
dc.date.created: 1995
dc.date.issued: 1995
dc.degree.level: Masters
dc.degree.name: M.A.Sc.
dc.description.abstract: This thesis deals with speech recognition in adverse environments. The primary problem is the mismatch between training and test conditions. Cases of mismatch include the recording channel, acoustic noise and the speaker. Noise shaping and subband filtering are two noise suppression techniques that work by utilizing properties of speech. Dynamic speech analysis and some feature extraction methods are inherently robust to the influence of noise. Linear discriminant analysis (LDA) can be used to combine disparate sets of speech parameters and obtain the optimal set of features. IMELDA (45) is one such method. In this thesis, we analyse the effectiveness of IMELDA under various training and test scenarios. Theoretical results are first derived and substantiated by simulations. It will be shown that LDA provides a form of noise shaping and the root-deconvolution technique is inappropriate for IMELDA. A new algorithm for predicting recognition performance is proposed and verified. Optimal cross-condition recognition is obtained by utilizing samples of noisy test speech in the within-class covariance, in the so-called QNT IMELDA transform. In the event that the noise is stationary, and can be modelled, we derive an equivalent transform by artificially modifying quiet speech samples. This suffices for the simplest instances. For the extreme helicopter case, we show the best approach to be a combination of band-pass filtering and dynamic analysis of the Mel-scale subbands. Unknown channel noise and additive noise are reduced through the respective subband processing algorithms. Finally, practical issues of applying LDA and integrating subband filtering in a speech recognition system are addressed.
dc.format.extent: 175 p.
dc.identifier.citation: Source: Masters Abstracts International, Volume: 35-05, page: 1493.
dc.identifier.isbn: 9780612157651
dc.identifier.uri: http://hdl.handle.net/10393/9483
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-16343
dc.publisher: University of Ottawa (Canada)
dc.subject.classification: Engineering, Electronics and Electrical.
dc.title: Speech recognition in adverse environments: Improvements to IMELDA.
dc.type: Thesis

Files

Original bundle

Name: MM15765.PDF
Size: 3.45 MB
Format: Adobe Portable Document Format