Enhancing Legal Text Entailment: Evaluating Model Architectures, Training Approaches, and Interpretability
| dc.contributor.author | Custeau, Michel | |
| dc.contributor.supervisor | Inkpen, Diana | |
| dc.date.accessioned | 2025-06-27T16:41:32Z | |
| dc.date.available | 2025-06-27T16:41:32Z | |
| dc.date.issued | 2025-06-27 | |
| dc.description.abstract | The legal domain is a challenge for Artificial Intelligence systems, as it is characterized by complex vocabulary, intricate reasoning, and the need for consistency with precedents. With increasing digitization, the potential for Artificial Intelligence to become a useful tool for legal projects has grown significantly. However, adoption within the legal field lags behind other industries due to cultural resistance, limited high-quality training data, and the need for interpretability in black-box systems. To advance the role of AI in the legal domain, we developed and evaluated a legal entailment classification system that determines whether a paragraph from an existing legal case supports the decision in a new case, while also providing a justification for the classification. Leveraging advanced Natural Language Processing techniques and Explainable AI methodologies, this work integrates domain-specific pretraining, lightweight adaptation techniques such as LoRA, and an ensemble technique. A dataset of over 35,000 Canadian legal cases was used for further pretraining, while fine-tuning and evaluation were performed using the COLIEE 2023 competition dataset. Our experiments highlight the trade-offs between computational efficiency and performance, and evaluate the impact of domain-specific pretraining on smaller transformer models such as RoBERTa, compared to adaptations of larger language models for the classification task. In addition to classification, this thesis explores the role of explainability techniques for Artificial Intelligence in legal applications by implementing LIME and model-generated justifications. These methods were assessed using human evaluations as well as automated sufficiency metrics, with the highest human-evaluation scores being 82.14% for adequacy, 92.86% for understandability, and 85.71% for trustworthiness, and a peak score of 99.17% for the automated sufficiency metric. The contributions of this research include the creation of domain-specific pretrained models, a comparative evaluation of fine-tuning and lightweight adaptation techniques for large language models, and a systematic exploration of explainability methods to improve interpretability and user trust. To the best of our knowledge, this study is the first within the Canadian legal AI context to investigate the effects of further pretraining on both small and large language models, as well as to integrate language model adaptation and explainability into a unified system for legal text entailment classification. | |
| dc.identifier.uri | http://hdl.handle.net/10393/50600 | |
| dc.identifier.uri | https://doi.org/10.20381/ruor-31202 | |
| dc.language.iso | en | |
| dc.publisher | Université d'Ottawa / University of Ottawa | |
| dc.subject | legal natural language processing | |
| dc.subject | explainable artificial intelligence | |
| dc.subject | legal entailment classification | |
| dc.subject | domain-specific pretraining | |
| dc.subject | transformer language models | |
| dc.title | Enhancing Legal Text Entailment: Evaluating Model Architectures, Training Approaches, and Interpretability | |
| dc.type | Thesis | en |
| thesis.degree.discipline | Génie / Engineering | |
| thesis.degree.level | Masters | |
| thesis.degree.name | MCS | |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science |
