Component-Based Crawling of Complex Rich Internet Applications

dc.contributor.author: Moosavi Byooki, Seyed Ali
dc.contributor.supervisor: Jourdan, Guy-Vincent
dc.contributor.supervisor: Onut, Iosif-Viorel
dc.date.accessioned: 2014-02-07T21:16:12Z
dc.date.available: 2014-02-07T21:16:12Z
dc.date.created: 2014
dc.date.issued: 2014
dc.degree.discipline: Génie / Engineering
dc.degree.level: masters
dc.degree.name: MSc
dc.description.abstract: During the past decade, web applications have evolved substantially. Taking advantage of new technologies, Rich Internet Applications (RIAs) make heavy use of client side code to present content. Web crawlers, however, face new challenges in crawling RIAs, such as how to explore and identify different client states. The problem of crawling RIAs has been a focus for researchers during recent years, and solutions have been proposed based on constructing a state-transition model with DOMs as states and JavaScript events as transitions. When faced with real-life RIAs, however, a major problem prevalent in current solutions is state space explosion caused by the complexity of the RIAs. This problem prevents the automated crawlers from being usable on complex RIAs as they fail to produce useful results in a timely fashion. This research addresses the challenge of efficiently crawling complex RIAs with two main ideas: component-based crawling and similarity detection. Our experimental results show that these ideas lead to a drastic reduction of the time required to produce results, enabling the crawler to explore RIAs previously too complex for automated crawl.
dc.embargo.terms: immediate
dc.faculty.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
dc.identifier.uri: http://hdl.handle.net/10393/30636
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-3546
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: AJAX
dc.subject: crawl
dc.subject: ria
dc.title: Component-Based Crawling of Complex Rich Internet Applications
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MSc
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Moosavi_Byooki_Seyed_Ali_2014_thesis.pdf
Size: 2.39 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.21 KB
Format: Item-specific license agreed upon to submission
Description: