"Roget's Thesaurus" as a lexical resource for natural language processing
Publisher
University of Ottawa (Canada)
Abstract
This dissertation presents an implementation of an electronic lexical knowledge base that uses the 1987 Penguin edition of Roget's Thesaurus as the source for its lexical material, making it the first implementation of a computerized Roget's to use an entire current edition. The dissertation explains the steps necessary for taking a machine-readable file and transforming it into a tractable system. Roget's organization is studied in detail and contrasted with WordNet's.
We show two applications of the computerized Thesaurus: computing semantic similarity between words and phrases, and building lexical chains in a text. The experiments are performed using well-known benchmarks and the results are compared to those of other systems that use Roget's, WordNet and statistical techniques. Roget's has turned out to be an excellent resource for measuring semantic similarity; lexical chains are easily built but more difficult to evaluate. We also explain ways in which Roget's Thesaurus and WordNet can be combined.
Citation
Source: Masters Abstracts International, Volume: 42-06, page: 2233.
