Automatic Poetry Classification Using Natural Language Processing

dc.contributor.author: Kesarwani, Vaibhav
dc.contributor.supervisor: Inkpen, Diana
dc.contributor.supervisor: Tanasescu, Chris
dc.date.accessioned: 2018-03-13T19:20:38Z
dc.date.available: 2018-03-13T19:20:38Z
dc.date.issued: 2018
dc.description.abstract: Poetry, as a special form of literature, is crucial for computational linguistics. It has a high density of emotions, figures of speech, vividness, creativity, and ambiguity. Poetry poses a much greater challenge for the application of Natural Language Processing algorithms than any other literary genre. Our system establishes a computational model that classifies poems based on similarity features like rhyme, diction, and metaphor. For rhyme analysis, we investigate the methods used to classify poems based on rhyme patterns. First, we give an overview of the different types of rhyme, along with a detailed description of how rhyme types and sub-types are detected by applying a pronunciation dictionary to our poetry dataset. We achieve an accuracy of 96.51% in identifying rhymes in poetry by applying a phonetic similarity model. We then obtain a rhyme quantification metric, RhymeScore, based on the matching phonetic transcription of each poem. We also develop an application for visualizing this quantified RhymeScore as a scatter plot in 2 or 3 dimensions. For diction analysis, we investigate the methods used to classify poems based on diction. First, the linguistic quantitative and semantic features that constitute diction are enumerated. Then we investigate the methodology used to compute these features from our poetry dataset. We also build a word embeddings model on our poetry dataset with 1.5 million words in 100 dimensions and do a comparative analysis with GloVe embeddings. Metaphor is a part of diction, but as it is a very complex topic in its own right, we address it as a stand-alone issue and develop several methods for it. Previous work on metaphor detection relies on either rule-based or statistical models, none of them applied to poetry. Our methods focus on metaphor detection in a poetry corpus, but we test on non-poetry data as well.
We combine rule-based and statistical models (word embeddings) to develop a new classification system. Our first metaphor detection method achieves a precision of 0.759 and a recall of 0.804 in identifying one type of metaphor in poetry, by using a Support Vector Machine classifier with various types of features. Furthermore, our deep learning model based on a Convolutional Neural Network achieves a precision of 0.831 and a recall of 0.836 for the same task. We also develop an application for generic metaphor detection in any type of natural text. [en]
dc.identifier.uri: http://hdl.handle.net/10393/37309
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-21581
dc.language.iso: en [en]
dc.publisher: Université d'Ottawa / University of Ottawa [en]
dc.subject: Natural language processing [en]
dc.subject: Machine learning [en]
dc.subject: Deep learning [en]
dc.subject: Metaphor [en]
dc.subject: Poetry [en]
dc.subject: Rhyme [en]
dc.subject: Word embeddings [en]
dc.subject: Diction [en]
dc.subject: Computational poetry [en]
dc.subject: Poetry classification [en]
dc.title: Automatic Poetry Classification Using Natural Language Processing [en]
dc.type: Thesis [en]
thesis.degree.discipline: Génie / Engineering [en]
thesis.degree.level: Masters [en]
thesis.degree.name: MCS [en]
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science [en]
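
Illustrative sketch (not taken from the thesis): the abstract describes detecting rhymes by applying a pronunciation dictionary and comparing phonetic transcriptions. One common way to do this, assumed here, is to treat two words as a perfect rhyme when their phonemes match from the last stressed vowel to the end of the word. The tiny dictionary below is hand-written in ARPAbet-style notation purely for illustration, and the names PRON, rhyme_part, is_perfect_rhyme, and rhyme_scheme are hypothetical. The thesis's actual phonetic similarity model and its RhymeScore metric may work differently.

```python
# Hypothetical mini pronunciation dictionary (ARPAbet-style; digits mark vowel stress).
# A real system would load a full pronunciation dictionary instead.
PRON = {
    "night":   ["N", "AY1", "T"],
    "light":   ["L", "AY1", "T"],
    "delight": ["D", "IH0", "L", "AY1", "T"],
    "moon":    ["M", "UW1", "N"],
    "june":    ["JH", "UW1", "N"],
    "love":    ["L", "AH1", "V"],
    "move":    ["M", "UW1", "V"],
}


def rhyme_part(phones):
    """Return the phonemes from the last stressed vowel to the end of the word."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":  # primary or secondary stress marker
            return phones[i:]
    return phones  # no stressed vowel found: fall back to the whole word


def is_perfect_rhyme(a, b):
    """True if two distinct known words share the same rhyming part."""
    a, b = a.lower(), b.lower()
    pa, pb = PRON.get(a), PRON.get(b)
    if pa is None or pb is None or a == b:
        return False
    return rhyme_part(pa) == rhyme_part(pb)


def rhyme_scheme(lines):
    """Label each non-empty line by the rhyme class of its final word (A, B, ...)."""
    letters = []
    representatives = []  # first end-word seen for each letter
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines (e.g. stanza breaks)
        word = tokens[-1].strip(".,;:!?").lower()
        for idx, rep in enumerate(representatives):
            if word == rep or is_perfect_rhyme(word, rep):
                letters.append(chr(ord("A") + idx))
                break
        else:
            representatives.append(word)
            letters.append(chr(ord("A") + len(representatives) - 1))
    return "".join(letters)


poem = [
    "The stars came out at night,",
    "beneath a silver moon;",
    "we walked in pale light",
    "one evening late in June.",
]
print(rhyme_scheme(poem))
```

Under this definition "love" and "move" do not rhyme, because their stressed vowels differ even though the spellings match. Words missing from the dictionary are simply treated as non-rhyming in this sketch.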

Files

Original bundle

Name: Kesarwani_Vaibhav_2018_thesis.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission