Investigating speed issues in acoustic-phonetic models for continuous speech recognition

Publisher

University of Ottawa (Canada)

Abstract

Automatic Speech Recognition (ASR) applications face two challenges: accuracy and speed. For good accuracy, Dynamic Programming (DP) and Hidden Markov Model (HMM) algorithms are widely used despite their heavy computational load. To address the speed problem, this thesis uses a Three-Stage Architecture (TSA): Stage 1 enhances the input speech signal and extracts features from it; Stage 2 performs acoustic-phonetic recognition and outputs strings of phonemes; Stage 3 completes the recognition into valid words by applying HMMs to these phoneme strings rather than to the utterances themselves. We designed two algorithms for Stage 2: Fast Two-Level Dynamic Programming (FTLDP), which is 20 times faster than a standard Two-Level DP, and ParallelRecognizer, which is 320 times faster than the standard Two-Level DP. Both algorithms are combined with silence detection based on a heuristic feature, the Cepstrum Gain Envelope Profile (CGEP), which shortens the input speech, and with clustering, which reduces the search space in the reference phonetic models.
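
The abstract does not give the details of FTLDP, ParallelRecognizer, or the CGEP feature, so the following is only a minimal sketch of the two generic ideas it builds on: dynamic-programming alignment of an input sequence against a reference, and trimming silent frames to shorten the input before matching. The function names `dtw_distance` and `trim_silence`, the scalar per-frame values, and the fixed threshold are assumptions made for illustration. They are plain dynamic time warping and plain threshold trimming, not the thesis's algorithms.

```python
from math import inf


def dtw_distance(a, b):
    """Plain dynamic time warping distance between two 1-D sequences.

    Illustrative only: this is textbook DTW, not FTLDP or Two-Level DP.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def trim_silence(frames, threshold):
    """Drop leading and trailing frames whose magnitude is below threshold.

    Hypothetical stand-in for silence detection; it does not compute CGEP.
    """
    start = 0
    while start < len(frames) and abs(frames[start]) < threshold:
        start += 1
    end = len(frames)
    while end > start and abs(frames[end - 1]) < threshold:
        end -= 1
    return frames[start:end]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    utterance = [0.0, 0.01, 1.0, 2.0, 2.0, 3.0, 0.02, 0.0]
    shortened = trim_silence(utterance, threshold=0.1)
    print(len(utterance), len(shortened), dtw_distance(shortened, reference))
```

The DP table has one cell per (input frame, reference frame) pair, so removing silent frames shrinks the table directly. Reducing the number of reference models to compare against, as clustering does, cuts the number of such tables that need to be filled.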

Citation

Source: Masters Abstracts International, Volume: 43-06, page: 2324.
