
Comparing Encoder-Decoder Architectures for Neural Machine Translation: A Challenge Set Approach

dc.contributor.author: Doan, Coraline
dc.contributor.supervisor: Marshman, Elizabeth
dc.date.accessioned: 2021-11-19T16:24:55Z
dc.date.available: 2021-11-19T16:24:55Z
dc.date.issued: 2021-11-19 [en_US]
dc.description.abstract: Machine translation (MT) as a field of research has seen significant advances in recent years, driven by increased interest in neural machine translation (NMT). By combining deep learning with translation, researchers have been able to deliver systems that perform better than most, if not all, of their predecessors. While the general consensus regarding NMT is that it renders higher-quality translations that are overall more idiomatic, researchers recognize that NMT systems still struggle with certain classic difficulties, and that their performance may vary depending on their architecture. In this project, we implement a challenge-set-based approach to the evaluation of examples of three main NMT architectures: convolutional neural network-based (CNN) systems, recurrent neural network-based (RNN) systems, and attention-based systems, all trained on the same data set for English-to-French translation. The challenge set focuses on a selection of lexical and syntactic difficulties (e.g., ambiguities) drawn from the literature on human translation, machine translation, and writing for translation, and also includes variations in sentence length and structure that are recognized as sources of difficulty even for NMT systems. This set allows us to evaluate performance in multiple areas of difficulty for the systems overall, as well as any differences in performance between architectures. Through our challenge set, we found that our CNN-based system tends to reword sentences, sometimes shifting their meaning, while our RNN-based system seems to perform better when provided with a larger context, and our attention-based system seems to struggle more as sentences grow longer. [en_US]
dc.identifier.uri: http://hdl.handle.net/10393/42936
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-27153
dc.language.iso: en [en_US]
dc.publisher: Université d'Ottawa / University of Ottawa [en_US]
dc.rights: Attribution-NonCommercial 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject: neural machine translation [en_US]
dc.subject: machine translation evaluation [en_US]
dc.subject: convolutional neural network [en_US]
dc.subject: recurrent neural network [en_US]
dc.subject: attention-based neural machine translation [en_US]
dc.subject: challenge set [en_US]
dc.title: Comparing Encoder-Decoder Architectures for Neural Machine Translation: A Challenge Set Approach [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Arts [en_US]
thesis.degree.level: Masters [en_US]
thesis.degree.name: MA [en_US]
uottawa.department: Traduction et interprétation / Translation and Interpretation [en_US]
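The challenge-set evaluation the abstract describes can be illustrated with a small harness: each challenge item is tagged with a difficulty category, and each system's output is checked against acceptable target-language patterns, yielding per-architecture, per-category pass rates. This is only a minimal sketch of the general technique, not the thesis's actual methodology; the challenge items, acceptance patterns, and the stubbed `systems` callables below are all hypothetical placeholders.

```python
# Minimal sketch of a challenge-set evaluation harness. Each system is
# assumed to expose a translate() callable; each challenge item lists
# regex patterns that an acceptable French output should match.
# All items, patterns, and system stubs here are illustrative only.
import re
from collections import defaultdict

CHALLENGE_SET = [
    {"category": "lexical ambiguity",
     "source": "The bank was closed.",
     "accept": [r"\bbanque\b", r"\brive\b"]},  # either sense accepted
    {"category": "long sentence",
     "source": "The report that the committee the minister appointed wrote was late.",
     "accept": [r"\brapport\b"]},
]

def evaluate(systems, challenge_set):
    """Return pass rates keyed by (system name, difficulty category).

    systems: dict mapping system name -> callable(str) -> str
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item in challenge_set:
        for name, translate in systems.items():
            output = translate(item["source"])
            key = (name, item["category"])
            totals[key] += 1
            if any(re.search(p, output, re.IGNORECASE) for p in item["accept"]):
                hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

# Stub "systems" standing in for trained CNN-, RNN-, and attention-based
# models; a real harness would call each model's decoder here.
systems = {
    "cnn": lambda s: "La banque était fermée.",
    "rnn": lambda s: "Le rapport était en retard.",
    "attention": lambda s: "La rive était fermée.",
}
rates = evaluate(systems, CHALLENGE_SET)
```

Breaking results down by (architecture, category) rather than reporting a single corpus-level score is what lets this kind of evaluation surface architecture-specific weaknesses, such as the length sensitivity the abstract reports for the attention-based system.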

Files

Original bundle

Name: Doan_Coraline_2021_thesis.pdf
Size: 4.73 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission