
A Language-Model-Based Approach for Detecting Incompleteness in Natural-Language Requirements

dc.contributor.author: Luitel, Dipeeka
dc.contributor.supervisor: Sabetzadeh, Mehrdad
dc.date.accessioned: 2023-05-24T17:42:30Z
dc.date.available: 2023-05-24T17:42:30Z
dc.date.issued: 2023-05-24
dc.description.abstract: [Context and motivation] Incompleteness in natural-language requirements is a challenging problem. [Question/Problem] A common technique for detecting incompleteness in requirements is checking the requirements against external sources. With the emergence of language models such as BERT, an interesting question is whether language models are useful external sources for finding potential incompleteness in requirements. [Principal ideas/results] We mask words in requirements and have BERT's masked language model (MLM) generate contextualized predictions for filling the masked slots. We simulate incompleteness by withholding content from requirements, and we measure BERT's ability to predict terminology that is present in the withheld content but absent from the content disclosed to BERT. [Contributions] BERT can be configured to generate multiple predictions per mask. Our first contribution is determining how many predictions per mask strikes the best trade-off between effectively discovering omissions in requirements and the level of noise in the predictions. Our second contribution is devising a machine-learning-based filter that post-processes BERT's predictions to further reduce noise. We empirically evaluate our solution over 40 requirements specifications drawn from the PURE dataset [30]. Our results indicate that: (1) predictions made by BERT are highly effective at pinpointing terminology that is missing from requirements, and (2) our filter can substantially reduce noise in the predictions, thus making BERT a more compelling aid for improving completeness in requirements.
dc.identifier.uri: http://hdl.handle.net/10393/44990
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-29196
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: BERT
dc.subject: Natural Language Processing
dc.subject: Machine Learning
dc.subject: Language Models
dc.title: A Language-Model-Based Approach for Detecting Incompleteness in Natural-Language Requirements
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MCS
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
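
The masked-prediction step described in the abstract can be sketched with the Hugging Face `transformers` fill-mask pipeline. This is a minimal illustration, not the thesis's actual implementation: the `bert-base-uncased` checkpoint, the sample requirement sentence, and the choice of five predictions per mask are all assumptions for demonstration.

```python
# Sketch of masked-slot prediction with BERT's MLM, assuming the
# Hugging Face "transformers" library. The model checkpoint and the
# example requirement below are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Mask a word in a requirement and ask the MLM for multiple
# contextualized predictions per mask; top_k controls how many
# candidates are returned (the trade-off the abstract studies).
requirement = "The system shall [MASK] all user data before transmission."
predictions = fill_mask(requirement, top_k=5)

# Each prediction carries the filled-in token and a probability score;
# candidate terms absent from the disclosed text hint at omissions.
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

Raising `top_k` surfaces more candidate terminology but also more noise, which is what motivates the post-processing filter the abstract describes.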

Files

Original bundle

Name: Luitel_Dipeeka_2023_thesis.pdf
Size: 4.45 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Description: Item-specific license agreed upon to submission