Automated acquisition of technical concepts from unrestricted English text using noun phrase classification.
Publisher
University of Ottawa (Canada)
Abstract
This thesis describes an approach to acquiring technical concepts from free English-language text without using knowledge specific to the domain of expertise described in the text. Only syntactic knowledge and text statistics are used to classify each Noun Phrase in the text into one of five categories of technicality, ranging from Technical to Not Technical. The algorithms devised and their performance are discussed. A secondary topic addressed in this thesis is syntactic category disambiguation. The Noun Phrase Classification module requires a Sentence Parser to extract the syntactic structure of each sentence in the text, and the parser in turn requires that the syntactic category (noun, verb, preposition, and so on) of each word appear in its Word Dictionary. A syntactic category disambiguation module was therefore designed to handle unknown words, that is, words not defined in the Word Dictionary. Whenever such a word is encountered in the text, the module attempts to determine its syntactic category automatically from the categories of the neighbouring words, using a bottom-up chart parser and text statistics.
Citation
Source: Masters Abstracts International, Volume: 32-05, page: 1417.
