The IntelliTweet: Unveiling Malicious Activities in Tweets Through a Multifaceted Feature Analysis
| dc.contributor.author | Dzeha, Eric | |
| dc.contributor.supervisor | Jourdan, Guy-Vincent | |
| dc.date.accessioned | 2024-05-21T22:11:04Z | |
| dc.date.available | 2024-05-21T22:11:04Z | |
| dc.date.issued | 2024-05-21 | |
| dc.description.abstract | Social media platforms have become an integral part of daily communication, facilitating information sharing, connections, and engagement for both individuals and businesses. Among these platforms, Twitter has emerged as one of the most popular for its rapid information dissemination and real-time interaction capabilities. However, the widespread adoption of Twitter has also attracted malicious activities such as phishing, spam, and scams, which take advantage of the platform's extensive reach to spread rapidly. In this thesis, we introduce "The IntelliTweet," a machine learning system designed to enhance real-time detection and classification of malicious tweets on Twitter. IntelliTweet employs a multifaceted feature approach that integrates content analysis, user profile attributes, sentiment analysis, URL analysis, and term frequency-inverse document frequency (TF-IDF) techniques. This holistic methodology considers the contextual nature of tweets, as well as content-based features and user behavior patterns, to accurately distinguish malicious tweets from legitimate ones, including user-reported tweets that raise awareness about threats. Our work began with an in-depth review of existing literature and the landscape of Twitter-centric threats, identifying shortcomings in current detection methodologies ranging from traditional assessments to machine learning classifiers. We then developed the concept of IntelliTweet and designed its features, which integrate tweet metadata, user profiles, and linguistic nuances within tweets. As part of this work, we created a database by collecting tweets in real time directly from the Twitter stream. This database contains a mix of malicious tweets, legitimate tweets, and user-reported tweets, allowing us to analyze the interactions between user-generated warnings and responses to malicious activities on Twitter. 
We conducted experiments including model selection, feature importance analysis, grid search optimization, hyperparameter tuning, and t-tests, providing a thorough evaluation of IntelliTweet's performance. Validated in both binary and multiclass configurations, IntelliTweet's precision-centric approach proves reliable and delivers significant improvements, achieving 98.80% precision, a 98.15% F1-score, and a low false positive rate of 0.07 on real-world Twitter data. By prioritizing false alarm reduction and maximizing global precision, IntelliTweet minimizes the mislabeling of legitimate users, accounting for the real-world consequences of user misclassification such as account suspension. IntelliTweet represents a positive step towards Twitter security and positive user experiences, contributing to the evolution of cybersecurity and providing valuable insights for mitigating emerging threats on the platform. We also suggest future research directions, including integrating user-centric features and cross-linguistic detection, and examining real-world applications and ethical considerations. Finally, we propose developing a global, multilingual defense mechanism against digital threats. | |
| dc.identifier.uri | http://hdl.handle.net/10393/46262 | |
| dc.identifier.uri | https://doi.org/10.20381/ruor-30359 | |
| dc.language.iso | en | |
| dc.publisher | Université d'Ottawa | University of Ottawa | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Malicious tweets | |
| dc.subject | Machine learning | |
| dc.subject | Phishing | |
| dc.subject | Scam | |
| dc.subject | Spam | |
| dc.subject | Features | |
| dc.subject | Text classification | |
| dc.subject | Sentiment analysis | |
| dc.subject | URL Analysis | |
| dc.subject | Obfuscation techniques | |
| dc.subject | Social media | |
| dc.subject | Cybercrime | |
| dc.subject | IntelliTweet | |
| dc.subject | Phishing Report | |
| dc.subject | Security | |
| dc.title | The IntelliTweet: Unveiling Malicious Activities in Tweets Through a Multifaceted Feature Analysis | |
| dc.type | Thesis | en |
| thesis.degree.discipline | Génie / Engineering | |
| thesis.degree.level | Masters | |
| thesis.degree.name | MCS | |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science |
