Development of an information retrieval and distillation agent

Publisher

University of Ottawa (Canada)

Abstract

Although many search engines are commercially available today, using most of them still demands tedious human effort. Much of the information they return may not be relevant to the intended query, and there is no systematic approach to quantifying the value of that information for the user's needs. In this thesis, to free the user from the drudgery of searching and to provide a basis for building a personalized database on a particular topic, we develop a web search and distillation agent. To retrieve higher-quality information, we modify the existing Term Frequency-Inverse Document Frequency (TFIDF) term weighting scheme and combine it with the Hyperlink-Induced Topic Search (HITS) method, yielding a measure of both the importance and the relevance of a document. To construct a dynamic graph and keep continuous search affordable, we propose a Sliding Window Model (SWM) that controls the size of the graph's node set. To improve the intelligence of the search agent, we employ the Exponential Smoothing (ES) approach to guide the search. Our experimental results show that the proposed web search and distillation approach is effective compared with other algorithms and models: the improved TFIDF algorithm makes the search results more reasonable; the proposed SWM controls the size of the node set as expected; and the ES algorithm employed in SWM further saves computing time, helps the search agent harvest higher-quality information, and offers considerable advantages over the other methods implemented in the search agent.
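
As a rough illustration of the building blocks named in the abstract, the following Python sketch shows the textbook forms of TF-IDF relevance, HITS authority scoring, and simple exponential smoothing. It does not reproduce the thesis's modified TFIDF scheme, its way of combining TFIDF with HITS, or the Sliding Window Model, none of which are specified here. All function names, the whitespace tokenizer, and the smoothing factor `alpha` are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_relevance(query_terms, docs):
    """Textbook TF-IDF relevance (not the thesis's modified scheme).

    docs maps a document id to its text. The score of a document is the
    sum over query terms of (term frequency) * log(N / document frequency).
    """
    n = len(docs)
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    scores = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        scores[d] = sum(
            (tf[t] / len(tokens)) * math.log(n / df[t])
            for t in query_terms
            if tokens and df[t] > 0
        )
    return scores


def hits_authority(links, iterations=50):
    """Standard HITS by repeated hub/authority updates.

    links maps a page to the list of pages it links to.
    Returns L2-normalized authority scores for every page seen.
    """
    nodes = set(links) | {v for outs in links.values() for v in outs}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of pages linking to it.
        auth = {n: 0.0 for n in nodes}
        for u, outs in links.items():
            for v in outs:
                auth[v] += hub[u]
        norm = math.sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {n: a / norm for n, a in auth.items()}
        # Hub score of a page: sum of authority scores of pages it links to.
        hub = {u: sum(auth[v] for v in links.get(u, [])) for u in nodes}
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {n: h / norm for n, h in hub.items()}
    return auth


def exp_smooth(values, alpha=0.5):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    s = values[0]
    for x in values[1:]:
        s = alpha * x + (1 - alpha) * s
    return s
```

One hypothetical use would be to multiply a page's relevance score by its authority score to rank it, and to smooth a running series of such scores to decide which branch of the crawl to follow next. That combination is an assumption for illustration, not the method reported in the thesis.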

Citation

Source: Masters Abstracts International, Volume: 42-06, page: 2237.
