Multiple classifier combination through ensembles and data generation

dc.contributor.author: Guo, Hong Yu
dc.date.accessioned: 2013-11-07T17:25:27Z
dc.date.available: 2013-11-07T17:25:27Z
dc.date.created: 2004
dc.date.issued: 2004
dc.degree.level: Masters
dc.degree.name: M.Sc.
dc.description.abstract: This thesis introduces two new approaches, the DataBoost and DataBoost-IM algorithms, to improve the predictive performance of Boosting algorithms. The DataBoost algorithm is designed to help Boosting algorithms avoid over-emphasizing hard examples. In the DataBoost algorithm, new synthetic data with bias information towards hard examples are added to the original training set when training the component classifiers. The DataBoost approach was evaluated on ten data sets, using both decision trees and neural networks as base classifiers. The experiments show promising results in terms of overall accuracy when compared with a standard benchmark Boosting algorithm. The DataBoost-IM algorithm is developed to learn from two-class imbalanced data sets. In the DataBoost-IM approach, the class frequencies and the total weights of the different classes within the ensemble's training set are rebalanced by adding new synthetic data. The DataBoost-IM method was evaluated, in terms of F-measure, G-mean and overall accuracy, on seventeen highly and moderately imbalanced data sets using decision trees as base classifiers. (Abstract shortened by UMI.)
dc.format.extent: 114 p.
dc.identifier.citation: Source: Masters Abstracts International, Volume: 43-06, page: 2406.
dc.identifier.uri: http://hdl.handle.net/10393/26648
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-9729
dc.language.iso: en
dc.publisher: University of Ottawa (Canada)
dc.subject.classification: Engineering, System Science.
dc.title: Multiple classifier combination through ensembles and data generation
dc.type: Thesis

Files

Original bundle

Name: MR01483.PDF
Size: 4.69 MB
Format: Adobe Portable Document Format