Model-based Crawling - An Approach to Design Efficient Crawling Strategies for Rich Internet Applications

dc.contributor.author: Dincturk, Mustafa Emre
dc.contributor.supervisor: Jourdan, Guy-Vincent
dc.date.accessioned: 2013-08-02T20:13:41Z
dc.date.available: 2013-08-02T20:13:41Z
dc.date.created: 2013
dc.date.issued: 2013
dc.degree.discipline: Génie / Engineering
dc.degree.level: doctorate
dc.degree.name: PhD
dc.description.abstract: Rich Internet Applications (RIAs) are a new generation of web applications that break away from the concepts on which traditional web applications are based. RIAs are more interactive and responsive than traditional web applications because they allow client-side scripting (such as JavaScript) and asynchronous communication with the server (using AJAX). Although these are improvements in terms of user-friendliness, they have a big impact on our ability to automatically explore (crawl) these applications. Traditional crawling algorithms are not sufficient for crawling RIAs. We should be able to crawl RIAs in order to search their content and build their models for various purposes such as reverse-engineering, detecting security vulnerabilities, assessing usability, and applying model-based testing techniques. One important problem is designing efficient crawling strategies for RIAs. It seems possible to design crawling strategies more efficient than the standard crawling strategies, Breadth-First and Depth-First. In this thesis, we explore the possibilities of designing efficient crawling strategies. We use a general approach that we call Model-based Crawling and present two crawling strategies that are designed using this approach. We show by experimental results that model-based crawling strategies are more efficient than the standard strategies.
dc.embargo.terms: immediate
dc.faculty.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
dc.identifier.uri: http://hdl.handle.net/10393/24375
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-3141
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: Rich Internet Applications
dc.subject: Web Crawling
dc.subject: Web Applications
dc.subject: Modeling
dc.subject: Model-based Crawling
dc.subject: AJAX
dc.subject: JavaScript
dc.title: Model-based Crawling - An Approach to Design Efficient Crawling Strategies for Rich Internet Applications
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Doctoral
thesis.degree.name: PhD
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name:
Dincturk_Mustafa_Emre_2013_thesis.pdf
Size:
2.68 MB
Format:
Adobe Portable Document Format
Description:
Main Article

License bundle

Name:
license.txt
Size:
4.21 KB
Format:
Item-specific license agreed upon to submission