Black-box Test Suite Minimization

dc.contributor.author: Pan, Rongqi
dc.contributor.supervisor: Briand, Lionel
dc.date.accessioned: 2024-12-19T21:31:23Z
dc.date.available: 2024-12-19T21:31:23Z
dc.date.issued: 2024-12-19
dc.description.abstract: In software testing, executing large test suites is time- and resource-consuming, sometimes impossible, and such test suites typically contain many redundant test cases that (mostly) find the same faults. Hence, test suite minimization (TSM) is used to remove redundant test cases that are unlikely to detect new faults. However, most TSM techniques rely on code coverage (white-box), model-based features, or requirements specifications, which are not always (entirely) accessible to test engineers. Code coverage analysis also leads to scalability issues, especially when applied to large industrial systems. Recently, a set of novel techniques called FAST-R was proposed that relies solely on test case code for TSM and appeared to be much more efficient than white-box techniques. However, it achieved comparatively low fault detection capability for Java projects, making its application challenging in practice. This thesis presents two contributions that address the key challenges in the TSM context by seeking better trade-offs between efficiency and fault detection. First, we propose ATM (AST-based Test case Minimizer), a similarity-based, search-based TSM technique that takes a specific budget as input and also relies exclusively on the source code of test cases, but attempts to achieve higher fault detection through finer-grained similarity analysis and a dedicated search algorithm. The results show that ATM achieved significantly higher fault detection rates (0.82 on average) than FAST-R (0.61 on average) and random minimization (0.52 on average) when running only 50% of the test cases, within practically acceptable time (1.1 to 4.3 hours, on average, per project version). To further improve the scalability of ATM, we propose LTM (Language model-based Test suite Minimization), a novel, scalable, black-box, similarity-based TSM approach based on large language models (LLMs). Experimental results show that the best configuration of LTM (UniXcoder/Cosine) outperforms ATM in three respects: (a) achieving a slightly greater saving rate of testing time (41.72% versus 41.02%, on average); (b) attaining a significantly higher fault detection rate (0.84 versus 0.81, on average); and, most importantly, (c) minimizing test suites nearly five times faster on average, with higher gains for larger test suites and systems, thus achieving much higher scalability.
dc.identifier.uri: http://hdl.handle.net/10393/50002
dc.identifier.uri: https://doi.org/10.20381/ruor-30801
dc.language.iso: en
dc.publisher: Université d'Ottawa | University of Ottawa
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Test case minimization
dc.subject: Test suite reduction
dc.subject: Tree-based similarity
dc.subject: AST
dc.subject: Genetic algorithm
dc.subject: Black-box testing
dc.subject: Pre-trained language models
dc.title: Black-box Test Suite Minimization
dc.type: Thesis [en]
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Doctoral
thesis.degree.name: PhD
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Pan_Rongqi_2024_thesis.pdf
Size: 970.93 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission
Description: