Black-box Test Suite Minimization

Pan, Rongqi

dc.contributor.author	Pan, Rongqi
dc.contributor.supervisor	Briand, Lionel
dc.date.accessioned	2024-12-19T21:31:23Z
dc.date.available	2024-12-19T21:31:23Z
dc.date.issued	2024-12-19
dc.description.abstract	In software testing, executing large test suites is time- and resource-consuming, and sometimes impossible; such test suites typically contain many redundant test cases that (mostly) find the same faults. Hence, test suite minimization (TSM) is used to remove redundant test cases that are unlikely to detect new faults. However, most TSM techniques rely on code coverage (white-box), model-based features, or requirements specifications, which are not always (entirely) accessible to test engineers. Code coverage analysis also leads to scalability issues, especially when applied to large industrial systems. Recently, a set of novel techniques called FAST-R was proposed; it relies solely on test case code for TSM and appeared to be much more efficient than white-box techniques. However, it achieved comparatively low fault detection capability for Java projects, making its application challenging in practice. This thesis presents two contributions addressing the key challenges in the TSM context, attempting to find better trade-offs between efficiency and fault detection. First, we propose ATM (AST-based Test case Minimizer), a similarity-based, search-based TSM technique that takes a specific budget as input and also relies exclusively on the source code of test cases, but attempts to achieve higher fault detection through finer-grained similarity analysis and a dedicated search algorithm. The results show that ATM achieved significantly higher fault detection rates (0.82 on average) than FAST-R (0.61 on average) and random minimization (0.52 on average) when running only 50% of the test cases, within practically acceptable time (1.1 to 4.3 hours, on average, per project version). To further improve the scalability of ATM, we propose LTM (Language model-based Test suite Minimization), a novel, scalable, black-box, similarity-based TSM approach based on large language models (LLMs). Experimental results show that the best configuration of LTM (UniXcoder/Cosine) outperforms ATM in three aspects: (a) achieving a slightly greater saving rate of testing time (41.72% versus 41.02%, on average); (b) attaining a significantly higher fault detection rate (0.84 versus 0.81, on average); and, most importantly, (c) minimizing test suites nearly five times faster on average, with higher gains for larger test suites and systems, thus achieving much higher scalability.
dc.identifier.uri	http://hdl.handle.net/10393/50002
dc.identifier.uri	https://doi.org/10.20381/ruor-30801
dc.language.iso	en
dc.publisher	Université d'Ottawa | University of Ottawa
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Test case minimization
dc.subject	Test suite reduction
dc.subject	Tree-based similarity
dc.subject	AST
dc.subject	Genetic algorithm
dc.subject	Black-box testing
dc.subject	Pre-trained language models
dc.title	Black-box Test Suite Minimization
dc.type	Thesis	en
thesis.degree.discipline	Génie / Engineering
thesis.degree.level	Doctoral
thesis.degree.name	PhD
uottawa.department	Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Pan_Rongqi_2024_thesis.pdf
Size: 970.93 KB
Format: Adobe Portable Document Format

Collections

- Thèses, 2011 - // Theses, 2011 -