Intelligent document format: A text encoding scheme.
Publisher
University of Ottawa (Canada)
Abstract
Text representation is a central issue in text retrieval and natural language processing: the way data is represented can significantly affect the efficiency of storage, retrieval, routing, query formulation, and information extraction. This thesis presents a technique for encoding textual data in a representation called the Intelligent Document Format (IDF), offered as a potential way to encode documents efficiently and effectively so that information can be retrieved easily. The IDF encodes English text so that it is stored efficiently, can be retrieved at the sentence, paragraph, and document levels, and supports term searching and retrieval as well as linguistic processing such as morphological analysis and sense disambiguation. To illustrate the encoding method, we describe IDFconvert, an IDF encoder and decoder program, and report encoding experiments on the electronic-text version of Dracula (Stoker 1897).
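The abstract does not specify the IDF layout, so the following is only a toy sketch of the general idea it describes: storing text as term identifiers arranged by paragraph and sentence, so that individual sentences can be decoded and terms can be located. Every name here (`encode`, `decode_sentence`, `find_term`, the vocabulary and postings structures) is hypothetical and is not taken from the thesis or from IDFconvert; the sentence splitter is deliberately naive, and no morphological analysis or sense disambiguation is attempted.

```python
import re
from collections import defaultdict


def encode(text):
    """Toy encoder (not the IDF): paragraphs -> sentences -> lists of term ids."""
    vocab = {}                     # term -> integer id (assumed dictionary scheme)
    postings = defaultdict(set)    # term id -> {(paragraph index, sentence index)}
    document = []
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for p_idx, paragraph in enumerate(paragraphs):
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
        encoded_paragraph = []
        for s_idx, sentence in enumerate(sentences):
            ids = []
            # Tokens are runs of word characters or single punctuation marks.
            for token in re.findall(r"\w+|[^\w\s]", sentence):
                term_id = vocab.setdefault(token, len(vocab))
                ids.append(term_id)
                postings[term_id].add((p_idx, s_idx))
            encoded_paragraph.append(ids)
        document.append(encoded_paragraph)
    return vocab, postings, document


def decode_sentence(vocab, document, p_idx, s_idx):
    """Rebuild the token sequence of one sentence from its term ids."""
    terms = {term_id: term for term, term_id in vocab.items()}
    return [terms[term_id] for term_id in document[p_idx][s_idx]]


def find_term(vocab, postings, term):
    """Return sorted (paragraph, sentence) positions that contain the exact term."""
    term_id = vocab.get(term)
    return sorted(postings[term_id]) if term_id is not None else []
```

In this sketch each distinct term is stored once in the vocabulary and referenced by a small integer elsewhere, and the nested lists make a single sentence or paragraph addressable without decoding the whole document. Whether the IDF itself uses a comparable arrangement is not stated in the abstract.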
Citation
Source: Masters Abstracts International, Volume: 36-01, page: 0212.
