Investigating different models for cross-language information retrieval from automatic speech transcripts

Title: Investigating different models for cross-language information retrieval from automatic speech transcripts
Authors: Alzghool, Muath
Date: 2009
Abstract: Speech information retrieval seeks to facilitate retrieving and accessing spoken content. Speech retrieval combines two techniques: automatic speech recognition (ASR) and information retrieval (IR). An ASR system first transcribes digitized audio into text, and a text retrieval system then retrieves speech segments in response to a user request (information need). However, because ASR is an imperfect process, some spoken words are not recognized correctly, which leads to word mismatches in retrieval. Early research considered spoken document retrieval for broadcast news a "solved problem" [1]. However, the problem remains open for other types of speech, in particular spontaneous conversational speech such as interviews, presentations, conferences, meetings, and lectures [2]. Unlike broadcast news (read speech), which has well-defined, distinct document units that resemble written documents, spontaneous speech suffers from a lack of clear topic boundaries and from poor acoustic conditions. The Cross-Language Speech Retrieval (CL-SR) track at the Cross-Language Evaluation Forum (CLEF) provides a collection of oral history interviews, which offers an excellent opportunity to study different speech retrieval techniques for spontaneous speech. The availability of open-source IR systems makes it possible for us to investigate different information retrieval techniques that have proved effective for text retrieval in the literature but have not been tested for spontaneous speech retrieval. Moreover, we propose five novel data fusion techniques: the first combines the results of different models with an appropriate weight for each; the second uses a cluster-based fusion technique; the third combines highly varied retrieval results; the fourth is based on a heuristic derivation of the weight for each retrieval strategy; and the last is based on probability theory.
To deal with the word mismatch problem, we also propose two query expansion methods, one based on collocations and the other based on a domain-specific thesaurus. Our system achieved the best results in the CL-SR task at CLEF in two years, 2005 and 2007, and was the second-best system in 2006.
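
The first fusion technique named in the abstract (combining the results of different models with a weight for each) can be illustrated with a minimal Python sketch. The abstract does not specify how scores are normalized or how weights are chosen, so the min-max normalization, the function names `min_max_normalize` and `weighted_fusion`, and the example weights are all assumptions made for illustration, not the thesis's actual method.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one run's scores to [0, 1] (assumed normalization, not from the thesis)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(runs, weights):
    """Fuse several retrieval runs by a weighted sum of normalized scores.

    runs    -- list of dicts mapping document id -> raw retrieval score
    weights -- one hypothetical weight per run, in the same order
    Returns a list of (document id, fused score), best first.
    """
    if len(runs) != len(weights):
        raise ValueError("need exactly one weight per run")
    fused = defaultdict(float)
    for run, weight in zip(runs, weights):
        for doc, score in min_max_normalize(run).items():
            # A document missing from a run simply contributes 0 for that run.
            fused[doc] += weight * score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Two hypothetical runs from different retrieval models over the same collection.
run_a = {"seg1": 2.0, "seg2": 1.0, "seg3": 0.0}
run_b = {"seg2": 10.0, "seg3": 5.0}
ranking = weighted_fusion([run_a, run_b], [0.6, 0.4])
```

Here a segment retrieved by both models can outrank one that only a single model scored highly, which is the usual motivation for fusing runs; in practice the weights would have to be tuned on training topics.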
Collection: Thèses, 1910 - 2010 // Theses, 1910 - 2010
File: NR61247.PDF (2.44 MB, Adobe PDF)