Investigating speed issues in acoustic-phonetic models for continuous speech recognition
Publisher
University of Ottawa (Canada)
Abstract
Automatic speech recognition applications face two challenges: accuracy and speed. To achieve good accuracy, Dynamic Programming (DP) and Hidden Markov Model (HMM) algorithms are widely used despite their heavy computational load. To address the speed problem, this thesis uses a Three-Stage Architecture (TSA): Stage 1 enhances the input speech signal and extracts features from it; Stage 2 performs recognition at the acoustic-phonetic level and outputs strings of phonemes; Stage 3 completes the recognition into valid words by applying HMMs to those phoneme strings rather than to whole utterances.
We designed two algorithms for Stage 2: Fast Two-Level Dynamic Programming (FTLDP), which is 20 times faster than a standard Two-Level DP, and ParallelRecognizer, which is 320 times faster than the standard Two-Level DP. Both algorithms are combined with a heuristic silence-detection feature based on the Cepstrum Gain Envelop Profile (CGEP), which shortens the input speech, and with clustering, which reduces the search space in the reference phonetic models.
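For readers unfamiliar with the dynamic-programming matching that Two-Level DP builds on, the following is a minimal sketch of textbook dynamic time warping between an input feature sequence and one reference template. It is not the thesis's FTLDP or ParallelRecognizer; the function name `dtw_distance`, the Euclidean local distance, and the unconstrained step pattern are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping between two sequences of feature vectors.

    Illustrative only: name, local distance, and step pattern are assumptions,
    not taken from the thesis.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames being compared
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Each comparison fills an n-by-m table, and a recognizer of this kind repeats that work for every reference model. That repetition is the computational load the abstract refers to, and it is why shortening the input and pruning the set of reference models can both reduce running time.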
Citation
Source: Masters Abstracts International, Volume: 43-06, page: 2324.
