Enhancing Text Readability Using Deep Learning Techniques

dc.contributor.author: Alkaldi, Wejdan
dc.contributor.supervisor: Inkpen, Diana
dc.date.accessioned: 2022-07-20T20:45:28Z
dc.date.available: 2022-07-20T20:45:28Z
dc.date.issued: 2022-07-20 [en_US]
dc.description.abstract: In the information era, reading has become more important for keeping up with the growing amount of knowledge. The ability to read a document varies from person to person depending on their skills and knowledge. It also depends on whether the readability level of the text matches the reader’s level. In this thesis, we propose a system that uses state-of-the-art machine learning and deep learning techniques to classify and simplify a text, taking into consideration the reader’s reading level. The system assigns any text to its corresponding readability level. If the text's readability level is higher than the reader’s level, i.e. the text is too difficult to read, the system performs text simplification to meet the desired readability level. The classification and simplification models are trained on data annotated with readability levels from the Newsela corpus. The trained simplification model operates at the sentence level, simplifying a given text to match a specific readability level. Moreover, the trained classification model is used to label additional unlabelled sentences from the Wikipedia Corpus and the Mechanical Turk Corpus in order to enrich the text simplification dataset. The augmented dataset is then used to improve the quality of the simplified sentences. The system generates simplified versions of a text based on the desired readability levels. This can help people with low literacy to read and understand any documents they need. It can also be beneficial to educators who assist readers with different reading levels. [en_US]
dc.identifier.uri: http://hdl.handle.net/10393/43831
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-28045
dc.language.iso: en [en_US]
dc.publisher: Université d'Ottawa / University of Ottawa [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: NLP [en_US]
dc.subject: Text Simplification [en_US]
dc.subject: Text Classification [en_US]
dc.subject: Deep Learning [en_US]
dc.subject: Reinforcement Learning [en_US]
dc.subject: Data Augmentation [en_US]
dc.subject: Natural Language Processing [en_US]
dc.title: Enhancing Text Readability Using Deep Learning Techniques [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Génie / Engineering [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: PhD [en_US]
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science [en_US]

Files

Original bundle

Name: Alkaldi_Wejdan_2022_thesis.pdf
Size: 1.42 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission