
Interpretability for Deep Learning Text Classifiers

dc.contributor.author: Lucaci, Diana
dc.contributor.supervisor: Inkpen, Diana
dc.date.accessioned: 2020-12-14T20:41:08Z
dc.date.available: 2020-12-14T20:41:08Z
dc.date.issued: 2020-12-14
dc.description.abstract: The ubiquitous presence of automated decision-making systems whose performance is comparable to that of humans has drawn attention to the need for interpretability of the generated predictions. Whether the goal is predicting the system's behavior when the input changes, building user trust, or assisting experts in improving machine learning methods, interpretability is paramount when the problem has not been sufficiently validated in real applications and when unacceptable results lead to significant consequences. While there are no standard interpretations for the decisions humans make, the complexity of systems with advanced information-processing capacities conceals the detailed explanations for individual predictions, encapsulating them under layers of abstraction and complex mathematical operations. Interpretability for deep learning classifiers thus becomes a challenging research topic, where the ambiguity of the problem statement allows for multiple exploratory paths. Our work focuses on generating natural language interpretations for individual predictions of deep learning text classifiers. We propose a framework for extracting and identifying the phrases of the training corpus that most influence the prediction confidence, combining unsupervised key phrase extraction with neural predictions. We assess the contribution of the added justification when the deep learning model predicts the class probability of a text instance by introducing a contribution metric that quantifies the fidelity of the explanation to the model. We evaluate both the performance impact of the proposed approach on the classification task, through quantitative analysis, and the quality of the generated justifications, through extensive qualitative and error analysis. This methodology captures the most influential phrases of the training corpus as explanations, revealing the linguistic features used for individual test predictions and allowing humans to predict the behavior of the deep learning classifier.
dc.identifier.uri: http://hdl.handle.net/10393/41564
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-25786
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: Interpretability
dc.subject: Deep Learning
dc.subject: Text Classifiers
dc.subject: Natural Language Processing
dc.subject: Text Mining
dc.title: Interpretability for Deep Learning Text Classifiers
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MCS
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
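The contribution metric described in the abstract above lends itself to a short illustration. The sketch below is one plausible reading, not the thesis's actual implementation: candidate key phrases (which the thesis mines from the training corpus via unsupervised key phrase extraction) are scored by how much appending them to an input shifts the classifier's confidence for its predicted class. The `predict_proba` interface, the `rank_phrases` helper, and the toy model are all hypothetical names introduced for this example.

```python
# A minimal, self-contained sketch of scoring candidate explanation phrases
# by their effect on a classifier's prediction confidence. Everything here
# is an illustrative assumption, not the thesis's method.
from typing import Callable, List, Tuple

def contribution(predict_proba: Callable[[str], float],
                 text: str, phrase: str) -> float:
    """Change in predicted-class confidence when the candidate
    justification phrase is appended to the input text."""
    return predict_proba(text + " " + phrase) - predict_proba(text)

def rank_phrases(predict_proba: Callable[[str], float],
                 text: str,
                 candidates: List[str],  # e.g. key phrases mined from the training corpus
                 top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank candidate phrases by their contribution to this prediction."""
    scored = [(p, contribution(predict_proba, text, p)) for p in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy stand-in for a deep learning classifier's class-probability output.
def toy_predict(text: str) -> float:
    positive = {"great", "excellent", "wonderful"}
    hits = sum(word.strip(".,!") in positive for word in text.lower().split())
    return min(0.5 + 0.15 * hits, 1.0)

if __name__ == "__main__":
    phrases = ["excellent performance", "wonderful cast", "ran long"]
    for phrase, score in rank_phrases(toy_predict, "A great movie.", phrases):
        print(f"{score:+.2f}  {phrase}")
```

With a real model, `predict_proba` would return the probability the classifier assigns to its predicted class, and the top-ranked phrases would serve as the natural language justification for that instance.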

Files

Original bundle

Name: Lucaci_Diana_2020_thesis.pdf
Size: 1.13 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon at submission