Compact features for sentiment analysis
Publisher
University of Ottawa (Canada)
Abstract
This work examines a novel method of developing features for machine learning of sentiment analysis and related tasks. Such tasks are frequently approached using a Bag of Words representation (one feature for each word encountered in the training data), which can easily produce thousands or tens of thousands of features. This thesis develops a set of "numeric" features by learning scores for words, dividing the range of possible scores into a number of bins, and then counting how many words in each document have scores in each bin. This allows for effective learning of sentiment and related tasks with 25 features; in fact, performance was very often slightly better with these features than with Bag of Words. This reduction in the number of features allows for the processing of much larger collections of texts than previously attempted. In addition, we carefully consider how to evaluate ordinal problems.
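The binning idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how word scores are learned or how the bins are laid out, so everything specific here is an assumption made for illustration: the scoring rule (a word's positive-minus-negative document-label balance, scaled to [-1, 1]), the use of 5 equal-width bins, and the hypothetical function names `learn_word_scores` and `bin_features`. It is not the thesis's actual method.

```python
from collections import Counter


def learn_word_scores(docs, labels):
    """Assumed scoring rule: +1 if a word occurs only in positive
    documents, -1 if only in negative ones, values in between otherwise."""
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        (pos if label == 1 else neg).update(text.lower().split())
    vocab = set(pos) | set(neg)
    return {w: (pos[w] - neg[w]) / (pos[w] + neg[w]) for w in vocab}


def bin_features(text, scores, n_bins=5, lo=-1.0, hi=1.0):
    """Count how many scored words of `text` fall into each of
    n_bins equal-width bins covering the score range [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for word in text.lower().split():
        if word not in scores:
            continue  # words unseen in training contribute nothing
        # Clamp so that a score equal to `hi` lands in the last bin.
        idx = min(int((scores[word] - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Toy training data (made up for this example): 1 = positive, 0 = negative.
train_docs = [
    "great fun great cast",
    "dull plot bad cast",
    "bad dull ending",
    "fun ending",
]
train_labels = [1, 0, 0, 1]

scores = learn_word_scores(train_docs, train_labels)
features = bin_features("great cast bad plot twist", scores)
print(features)  # [2, 0, 1, 0, 1]
```

Whatever the vocabulary size, each document is reduced to `n_bins` counts, which is the source of the compactness the abstract describes. A Bag of Words vector for the same document would instead have one entry per training word.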
Citation
Source: Masters Abstracts International, Volume: 48-06, page: 3709.
