"Roget's Thesaurus" as a lexical resource for natural language processing

dc.contributor.advisor: Szpakowicz, Stan
dc.contributor.author: Jarmasz, Mario
dc.date.accessioned: 2013-11-07T17:24:40Z
dc.date.available: 2013-11-07T17:24:40Z
dc.date.created: 2003
dc.date.issued: 2003
dc.degree.level: Masters
dc.degree.name: M.C.S.
dc.description.abstract: This dissertation presents an implementation of an electronic lexical knowledge base that uses the 1987 Penguin edition of Roget's Thesaurus as the source for its lexical material, the first implementation of a computerized Roget's to use an entire current edition. It explains the steps necessary for taking a machine-readable file and transforming it into a tractable system. Roget's organization is studied in detail and contrasted with WordNet's. We show two applications of the computerized Thesaurus: computing semantic similarity between words and phrases, and building lexical chains in a text. The experiments are performed using well-known benchmarks and the results are compared to those of other systems that use Roget's, WordNet and statistical techniques. Roget's has turned out to be an excellent resource for measuring semantic similarity; lexical chains are easily built but more difficult to evaluate. We also explain ways in which Roget's Thesaurus and WordNet can be combined.
dc.format.extent: 220 p.
dc.identifier.citation: Source: Masters Abstracts International, Volume: 42-06, page: 2233.
dc.identifier.uri: http://hdl.handle.net/10393/26493
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-18213
dc.language.iso: en
dc.publisher: University of Ottawa (Canada)
dc.subject.classification: Language, Linguistics.
dc.subject.classification: Artificial Intelligence.
dc.subject.classification: Computer Science.
dc.title: "Roget's Thesaurus" as a lexical resource for natural language processing
dc.type: Thesis

Files

Original bundle

Name: MQ90084.PDF
Size: 11.05 MB
Format: Adobe Portable Document Format