Advancing Cross-Domain Fake News Detection: Enhanced Models to Improve Generalization and Tackle the Class Imbalance Problem
| dc.contributor.author | Alnabhan, Mohammad Qasim Mohammad | |
| dc.contributor.supervisor | Branco, Paula | |
| dc.date.accessioned | 2025-03-12T17:56:40Z | |
| dc.date.available | 2025-03-12T17:56:40Z | |
| dc.date.issued | 2025-03-12 | |
| dc.description.abstract | The rapid proliferation of fake news across domains such as politics, health, and social media poses a significant threat to the integrity of information dissemination, leading to misinformation that can affect public perception and decision-making. Detecting fake news is critical to preserving the credibility of media sources, protecting democratic processes, and ensuring public safety. However, as Artificial Intelligence (AI)-driven Fake News Detection (FND) systems become more prevalent, ensuring AI safety is equally important to prevent unintended biases, ethical concerns, and adversarial exploitation of these models. Several challenges hinder the effectiveness of current FND models. Among these, cross-domain generalization and class imbalance are two critical problems that considerably impact the performance of detection systems. Although FND faces multiple challenges, this thesis focuses on these two problems due to their widespread influence across datasets and domains. Cross-domain generalization is difficult because the characteristics of fake news differ significantly between domains: political fake news, for example, may be presented very differently from health-related misinformation. This variation in linguistic and contextual features makes it hard for models trained in one domain to generalize to another. Nevertheless, important patterns may be shared across domains and can support the detection of fake news in new domains, improving model generalizability. This thesis adopts a domain classification approach to address this issue, allowing the model to identify the domain before applying a domain-specific FND strategy, thereby improving overall generalization. Class imbalance, where fake news is underrepresented in datasets compared to real news, presents another challenge, as models often become biased toward the majority class.
This thesis addresses this problem by incorporating advanced Deep Learning (DL) techniques, such as Transfer Learning (TL) and class imbalance mitigation strategies, to ensure that models learn the nuances of detecting fake news even when instances are scarce. The contributions of this thesis fall into the following areas. The first contribution is a comprehensive Systematic Literature Review (SLR) of the DL techniques currently used in FND. This review highlights the challenges existing methods face, particularly with respect to cross-domain generalization and class imbalance, laying the foundation for developing novel approaches. The second contribution is the development of a two-tiered, multi-domain FND framework. This framework integrates domain classification with FND, enhancing the model's ability to generalize across diverse domains while mitigating the class imbalance issue through a range of techniques. The third contribution is the introduction of a fine-tuned, prompt-tuned Llama 2 model. This model leverages pre-trained knowledge from Large Language Models (LLMs) and adapts it to the specific task of FND, improving performance by focusing on domain-specific characteristics. The thesis presents an extensive evaluation of the proposed approaches using publicly available datasets from various domains, including politics, health, science, crime, and social media. The results demonstrate that our methods significantly improve the robustness and accuracy of FND models, providing benchmarks for future research in this field. Additionally, the findings emphasize the importance of AI safety in FND, ensuring that models are not only effective but also fair, explainable, and resistant to adversarial manipulation.
These contributions advance the state of the art in FND and establish a more resilient framework for detecting fake news across multiple domains, addressing critical challenges that have previously limited the generalization and reliability of detection systems. | |
| dc.identifier.uri | http://hdl.handle.net/10393/50256 | |
| dc.identifier.uri | https://doi.org/10.20381/ruor-30972 | |
| dc.language.iso | en | |
| dc.publisher | Université d'Ottawa / University of Ottawa | |
| dc.subject | Large Language Model | |
| dc.subject | Deep Learning | |
| dc.subject | Transfer Learning | |
| dc.subject | Class Imbalance | |
| dc.subject | Applied Artificial Intelligence (AI) | |
| dc.subject | AI Safety | |
| dc.subject | Model Generalizability | |
| dc.title | Advancing Cross-Domain Fake News Detection: Enhanced Models to Improve Generalization and Tackle the Class Imbalance Problem | |
| dc.type | Thesis | en |
| thesis.degree.discipline | Génie / Engineering | |
| thesis.degree.level | Doctoral | |
| thesis.degree.name | PhD | |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science |
