Extending AdaBoost: Varying the Base Learners and Modifying the Weight Calculation

dc.contributor.author: Neves de Souza, Erico
dc.contributor.supervisor: Matwin, Stan
dc.date.accessioned: 2014-05-27T13:35:31Z
dc.date.available: 2014-05-27T13:35:31Z
dc.date.created: 2014
dc.date.issued: 2014
dc.degree.discipline: Génie / Engineering
dc.degree.level: doctorate
dc.degree.name: PhD
dc.description.abstract: AdaBoost has been considered one of the best classifiers ever developed, but two important problems have not yet been addressed. The first is the dependency on the "weak" learner, and the second is the failure to maintain the performance of learners with small error rates (i.e. "strong" learners). To address the first problem, this work proposes using a different learner in each iteration — a variant called AdaBoost Dynamic (AD) — thereby ensuring that the performance of the algorithm is almost equal to that of the best "weak" learner executed with AdaBoost.M1. The work then further modifies the procedure to vary the learner in each iteration in order to locate the learner with the smallest error rate on its training data, while keeping the same weight calculation as in the original AdaBoost; this version is called AdaBoost Dynamic with Exponential Loss (AB-EL). Its results were poor, because AdaBoost does not perform well with strong learners; in this sense, the work confirmed results from previous studies. To improve performance, the weight calculation is modified to use the sigmoid function, with the algorithm's output being the derivative of that sigmoid function, rather than the logistic regression weight calculation originally used by AdaBoost; this version is called AdaBoost Dynamic with Logistic Loss (AB-DL). This work presents a convergence proof that the binomial weight calculation works, and shows both theoretically and empirically that this approach improves the results for strong learners. AB-DL also has some disadvantages, such as the cost of searching for the "best" classifier and the fact that this search reduces diversity among the classifiers. To address these issues, another algorithm is proposed that combines AD's "weak" learner execution policy with a small modification of AB-DL's weight calculation; it is called AdaBoost Dynamic with Added Cost (AD-AC).
AD-AC also has a theoretical upper bound on its error, and the algorithm offers a small accuracy improvement over AB-DL and traditional AdaBoost approaches. Lastly, this work adapts AD-AC's weight calculation to the data stream setting, where classifiers must deal with very large data sets (on the order of millions of instances) and limited memory.
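To illustrate the core idea described in the abstract — running AdaBoost.M1 but drawing a different base learner on each boosting iteration instead of reusing one fixed weak learner — here is a minimal sketch. This is not the thesis's implementation; the function names, the choice of base learners, and the round count are illustrative assumptions, and the weight update shown is the standard AdaBoost.M1 exponential update for labels in {-1, +1}, not the modified AB-DL/AD-AC calculation.

```python
# Illustrative sketch (not the thesis code) of the "AdaBoost Dynamic" idea:
# AdaBoost.M1 that cycles through a pool of base learners, one per iteration.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification


def adaboost_dynamic(X, y, learners, n_rounds=9):
    """AdaBoost.M1 for labels in {-1, +1}, varying the base learner each round."""
    n = len(X)
    w = np.full(n, 1.0 / n)              # instance weights D_t, initially uniform
    ensemble = []                        # list of (alpha_t, fitted model) pairs
    for t in range(n_rounds):
        base = learners[t % len(learners)]          # rotate through the pool
        model = clone(base).fit(X, y, sample_weight=w)
        pred = model.predict(X)
        eps = w[pred != y].sum()                    # weighted training error
        if eps == 0 or eps >= 0.5:                  # M1 stopping conditions
            break
        alpha = 0.5 * np.log((1 - eps) / eps)       # learner's vote weight
        w *= np.exp(-alpha * y * pred)              # up-weight the mistakes
        w /= w.sum()                                # renormalize D_{t+1}
        ensemble.append((alpha, model))
    return ensemble


def predict(ensemble, X):
    """Weighted-vote prediction of the boosted ensemble."""
    score = sum(alpha * model.predict(X) for alpha, model in ensemble)
    return np.sign(score)
```

With a pool such as a decision stump, Gaussian naive Bayes, and logistic regression, each round fits whichever learner is next in the rotation on the current instance weights, so no single "weak" learner choice has to be made up front — which is the motivation the abstract gives for AD.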
dc.embargo.terms: immediate
dc.faculty.department: Science informatique et génie électrique / Electrical Engineering and Computer Science
dc.identifier.uri: http://hdl.handle.net/10393/31146
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-3756
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.subject: AdaBoost
dc.subject: Machine Learning
dc.subject: Data Stream
dc.title: Extending AdaBoost: Varying the Base Learners and Modifying the Weight Calculation
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Doctoral
thesis.degree.name: PhD
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Neves_De_Souza_Erico_2014_thesis.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.21 KB
Format: Item-specific license agreed upon to submission