Detection and Analysis of Phishing Attacks
| dc.contributor.author | Cui, Qian | |
| dc.contributor.supervisor | Jourdan, Guy-Vincent | |
| dc.contributor.supervisor | Onut, Iosif Viorel | |
| dc.date.accessioned | 2019-11-29T13:40:32Z | |
| dc.date.available | 2019-11-29T13:40:32Z | |
| dc.date.issued | 2019-11-29 | en_US |
| dc.description.abstract | In a "phishing attack", a legitimate website is impersonated in order to steal sensitive information from end users. Phishing attacks are a significant threat to individuals and corporations on today's Internet. This problem has been actively researched by both academia and industry over the past few years. Attempts to provide effective anti-phishing solutions have followed two main approaches. The first is to identify a phishing attack by measuring its similarity to the target site. The second is to look at intrinsic characteristics of the attacks. In this thesis, we first look at this problem from a new angle. Instead of using the intrinsic characteristics of an attack or comparing attacks with their target sites, we go back to the source of the problem. We perform an in-depth analysis of how phishing attacks are built by the attackers. We show that most phishing attacks are duplicates or quasi-duplicates of former attacks. Given that phishing attacks are not built from scratch, we propose two clustering-based methods to evaluate the similarity between attacks. When comparing a newly reported attack against our database of known ones, our method achieves an accuracy of at least 90%, with a false-positive rate of 0.65%. We then explore the evolution of phishing attacks and track variations over time. Our aim is to better understand what attackers change, and why, across iterations of an attack. We propose a graph-based model to monitor and analyze these changes and their relations. In addition to the detection and analysis of phishing attacks on the client side, we also explore the server-side aspect of phishing. We conduct a static analysis of the source code of "phishing kits" and propose an approach to track stolen information. Since most phishing attacks use email as the means to exfiltrate stolen information, we propose a deep learning model to detect these messages in network traffic. For example, this approach can be used to detect that a phishing attack is hosted inside a large network. The third and final contribution of this thesis is a "blind" phishing scanning system, which is used to search for and identify unreported phishing attacks at large scale. The only input of that system is a very large list of domain names. To handle the list efficiently, we propose a ranking algorithm that combines natural language processing and machine learning techniques to prioritize the domains that are most likely to be harmful. We then mine our extensive, real-time phishing attack database to guess possible URLs of attacks on these domains, and use our own detection algorithm to detect them. | en_US |
| dc.identifier.uri | http://hdl.handle.net/10393/39891 | |
| dc.identifier.uri | http://dx.doi.org/10.20381/ruor-24130 | |
| dc.language.iso | en | en_US |
| dc.publisher | Université d'Ottawa / University of Ottawa | en_US |
| dc.subject | Phishing | en_US |
| dc.subject | SVM | en_US |
| dc.subject | Clustering | en_US |
| dc.subject | LSTM | en_US |
| dc.title | Detection and Analysis of Phishing Attacks | en_US |
| dc.type | Thesis | en_US |
| thesis.degree.discipline | Génie / Engineering | en_US |
| thesis.degree.level | Doctoral | en_US |
| thesis.degree.name | PhD | en_US |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science | en_US |
