The IntelliTweet: Unveiling Malicious Activities in Tweets Through a Multifaceted Feature Analysis

Dzeha, Eric

dc.contributor.author	Dzeha, Eric
dc.contributor.supervisor	Jourdan, Guy-Vincent
dc.date.accessioned	2024-05-21T22:11:04Z
dc.date.available	2024-05-21T22:11:04Z
dc.date.issued	2024-05-21
dc.description.abstract	Social media platforms have become integral to daily communication, facilitating information sharing, connection, and engagement for both individuals and businesses. Among them, Twitter has emerged as one of the most popular for its rapid information dissemination and real-time interaction. However, the widespread adoption of Twitter has also attracted malicious activities such as phishing, spam, and scams, which exploit the platform's extensive reach to spread rapidly. In this thesis, we introduce "The IntelliTweet," a machine learning system designed to enhance real-time detection and classification of malicious tweets on Twitter. IntelliTweet takes a multifaceted feature approach, integrating content analysis, user profile attributes, sentiment analysis, URL analysis, and term frequency-inverse document frequency (TF-IDF) techniques. This holistic methodology considers the contextual nature of tweets, together with content-based features and user behavior patterns, to accurately distinguish malicious tweets from legitimate ones, including user-reported tweets that raise awareness about threats. Our work began with an in-depth review of the existing literature and the landscape of Twitter-centric threats, identifying shortcomings in current detection methodologies ranging from traditional assessments to machine learning classifiers. We then developed the concept of IntelliTweet and its feature design, which integrates tweet metadata, user profiles, and linguistic nuances within tweets. As part of this work, we built a database by collecting tweets in real time directly from the Twitter stream. This database contains a mix of malicious tweets, legitimate tweets, and user-reported tweets, allowing us to analyze the interactions between user-generated warnings and responses to malicious activities on Twitter.
We conducted experiments including model selection, feature importance analysis, grid search optimization, hyperparameter tuning, and t-tests, providing a thorough evaluation of IntelliTweet's performance. Validated in both binary and multiclass configurations, IntelliTweet's precision-centric approach demonstrates reliability and significant improvements, achieving 98.80% precision, a 98.15% F1-score, and a low false positive rate of 0.07 on real-world Twitter data. By prioritizing false alarm reduction and maximizing global precision, IntelliTweet minimizes the mislabeling of legitimate users, accounting for the real-world consequences of user misclassification such as account suspension. IntelliTweet represents a positive step towards Twitter security and positive user experiences, contributing to the evolution of cybersecurity and providing valuable insights for mitigating emerging threats on the platform. We also suggest future research directions, including integrating user-centric features and cross-linguistic detection, and considering real-world applications and ethical considerations. Finally, we propose developing a global, multilingual defense mechanism against digital threats.
dc.identifier.uri	http://hdl.handle.net/10393/46262
dc.identifier.uri	https://doi.org/10.20381/ruor-30359
dc.language.iso	en
dc.publisher	Université d'Ottawa | University of Ottawa
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Twitter
dc.subject	Malicious tweets
dc.subject	Machine learning
dc.subject	Phishing
dc.subject	Scam
dc.subject	Spam
dc.subject	Features
dc.subject	Text classification
dc.subject	Sentiment analysis
dc.subject	URL Analysis
dc.subject	Obfuscation techniques
dc.subject	Social media
dc.subject	Cybercrime
dc.subject	IntelliTweet
dc.subject	Phishing Report
dc.subject	Security
dc.title	The IntelliTweet: Unveiling Malicious Activities in Tweets Through a Multifaceted Feature Analysis
dc.type	Thesis	en
thesis.degree.discipline	Génie / Engineering
thesis.degree.level	Masters
thesis.degree.name	MCS
uottawa.department	Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Showing items 1 - 1 of 1

Name: Dzeha_Eric_2024_thesis.pdf
Size: 8.24 MB
Format: Adobe Portable Document Format

License bundle

Showing items 1 - 1 of 1

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission

Collections

- Thèses, 2011 - // Theses, 2011 -