
Prioritization and Evolution of Regression Tests: Data-Driven Solutions for Continuous Integration Contexts

dc.contributor.author: Saboor Yaraghi, Ahmadreza
dc.contributor.supervisor: Briand, Lionel C.
dc.date.accessioned: 2025-05-14T17:44:17Z
dc.date.available: 2025-05-14T17:44:17Z
dc.date.issued: 2025-05-14
dc.description.abstract:
Regression testing is a software testing method used to ensure that existing functionality still works correctly when changes are applied to a software system. It involves executing all or a subset of the test cases. In Continuous Integration (CI) contexts, where code changes are frequent, regression testing is critical to ensuring software quality. However, executing regression tests incurs high time and resource costs, and evolving test cases to keep pace with rapidly changing software is itself difficult. This thesis addresses these challenges by optimizing the prioritization of regression tests and supporting their evolution following code changes.

The first part of the thesis focuses on Test Case Prioritization (TCP) to improve the efficiency of regression testing in CI contexts. Many recent TCP studies employ Machine Learning (ML) techniques to cope with the dynamic and complex nature of CI. However, most of them train ML models on a limited number of features and evaluate them on subjects for which TCP offers little practical value, given their short regression testing times and few failed builds. This thesis begins by defining a data model that represents the data sources and their relationships within a CI environment. We use this model as the basis for identifying and collecting a comprehensive set of features, encompassing all features previously used in related studies. We collect these features from 25 open-source projects with significantly larger regression testing times and more failed builds than existing benchmarks. Extensive experiments assess the performance of ML-based TCP, evaluating its effectiveness, feature collection costs, and the decay of model performance over time. The study also examines the trade-offs between feature collection costs and TCP effectiveness, providing practical insights for deploying ML-based TCP in real-world CI environments.

The second part of the thesis introduces TaRGET (Test Repair GEneraTor), a novel approach for automating the repair of broken test cases. Test cases must be updated frequently to remain aligned with the evolving system under test (SUT), which makes their maintenance complex and costly. Broken test cases, if left unrepaired, degrade the quality of test suites, disrupt the development process, and waste developers' time. TaRGET leverages pre-trained code language models (CLMs) and treats test repair as a language translation task. It employs a two-step process that integrates essential contextual information, such as SUT code changes, and fine-tunes CLMs to improve the accuracy and relevance of the generated repairs. To validate TaRGET, the thesis introduces TaRBench, a comprehensive benchmark containing over 45,000 test repair instances across 59 open-source projects, addressing the limitations of existing benchmarks, which often contain few instances or projects and lack diversity in test repair scenarios. Experimental results show that TaRGET achieves 66.1% exact-match accuracy in repairing test cases. The study further examines TaRGET's effectiveness across different test repair scenarios using both quantitative and qualitative analyses.
Practical guidelines are also provided for identifying scenarios in which the generated repairs may be less reliable. The study also examines repair generation for new projects without fine-tuning and evaluates whether project-specific fine-tuning is necessary to achieve acceptable performance.

By combining advanced ML techniques with practical tools and benchmarks, this thesis provides a unified framework for supporting the prioritization and evolution of regression tests. To enable reproducibility and reusability, all experimental data, tools, and benchmarks are made publicly available to practitioners and researchers.
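As a rough illustration of the ML-based TCP idea described in the abstract, the following minimal Python sketch trains a classifier on historical build features and ranks a test suite by predicted failure probability. The feature names, model choice, and data values are assumptions chosen for illustration only; they are not the thesis's actual data model or feature set.

# Minimal sketch of ML-based test case prioritization (TCP), assuming a
# simple failure-prediction formulation. Features are hypothetical.
from sklearn.ensemble import GradientBoostingClassifier

# Each row: one (build, test case) pair from past CI builds.
# Columns (assumed): recent_fail_rate, age_days, exec_time_s, change_proximity
X_train = [
    [0.30, 120, 4.1, 0.8],
    [0.05, 900, 1.2, 0.1],
    [0.55,  60, 7.9, 0.9],
    [0.00, 400, 0.6, 0.0],
]
y_train = [1, 0, 1, 0]  # 1 = the test failed in that build

model = GradientBoostingClassifier().fit(X_train, y_train)

# New build: order the suite so likely-failing tests run first.
tests = {"testLogin": [0.40, 90, 3.0, 0.7], "testExport": [0.02, 700, 2.0, 0.1]}
ranked = sorted(tests, key=lambda t: model.predict_proba([tests[t]])[0][1], reverse=True)
print(ranked)  # test names, highest predicted failure probability first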
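Similarly, the core idea of framing test repair as a translation task can be sketched as follows: each training instance pairs a serialized source (the broken test plus relevant SUT changes) with a target (the repaired test), which a sequence-to-sequence CLM can then be fine-tuned on. The tags and serialization format below are hypothetical, not TaRGET's actual encoding.

# Minimal sketch of test repair as "translation", assuming a simple
# tag-based serialization. Tags and example code are illustrative only.
broken_test = 'assertEquals(user.getName(), "alice");'
sut_change = '- public String getName()\n+ public String getFullName()'
repaired_test = 'assertEquals(user.getFullName(), "alice");'

source = f"[BROKEN_TEST] {broken_test} [SUT_CHANGE] {sut_change}"
target = repaired_test

# A fine-tuning corpus is then just (source, target) pairs, suitable for
# any sequence-to-sequence code language model.
corpus = [(source, target)]
print(corpus[0][0])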
dc.identifier.uri: http://hdl.handle.net/10393/50482
dc.identifier.uri: https://doi.org/10.20381/ruor-31122
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.rights: Attribution-ShareAlike 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/
dc.subject: continuous integration
dc.subject: software testing
dc.subject: test case prioritization
dc.subject: machine learning
dc.subject: language models
dc.subject: automated test case repair
dc.subject: test case maintenance
dc.subject: test case evolution
dc.subject: fine-tuning
dc.title: Prioritization and Evolution of Regression Tests: Data-Driven Solutions for Continuous Integration Contexts
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Doctoral
thesis.degree.name: PhD
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Saboor_Yaraghi_Ahmadreza_2025_thesis.pdf
Size: 1.91 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission