Model-based Crawling - An Approach to Design Efficient Crawling Strategies for Rich Internet Applications

Description
Title: Model-based Crawling - An Approach to Design Efficient Crawling Strategies for Rich Internet Applications
Authors: Dincturk, Mustafa Emre
Date: 2013
Abstract: Rich Internet Applications (RIAs) are a new generation of web applications that break away from the concepts on which traditional web applications are based. RIAs are more interactive and responsive than traditional web applications because they allow client-side scripting (such as JavaScript) and asynchronous communication with the server (using AJAX). Although these are improvements in user-friendliness, they greatly affect our ability to automatically explore (crawl) these applications: traditional crawling algorithms are not sufficient for crawling RIAs. We need to crawl RIAs in order to search their content and to build models of them for purposes such as reverse engineering, detecting security vulnerabilities, assessing usability, and applying model-based testing techniques. One important problem is designing efficient crawling strategies for RIAs. It seems possible to design crawling strategies that are more efficient than the standard strategies, Breadth-First and Depth-First. In this thesis, we explore the possibilities of designing such strategies. We use a general approach that we call Model-based Crawling and present two crawling strategies designed using this approach. Experimental results show that the model-based crawling strategies are more efficient than the standard strategies.
URL: http://hdl.handle.net/10393/24375
DOI: http://dx.doi.org/10.20381/ruor-3141
Collection: Thèses, 2011 - // Theses, 2011 -
Files
Dincturk_Mustafa_Emre_2013_thesis.pdf, Main Article, 2.75 MB, Adobe PDF