Repository logo

Learning in the Presence of Skew and Missing Labels Through Online Ensembles and Meta-reinforcement Learning

dc.contributor.authorVafaie, Parsa
dc.contributor.supervisorViktor, Herna
dc.contributor.supervisorMichalowski, Wojtek
dc.date.accessioned2021-09-07T18:09:10Z
dc.date.available2021-09-07T18:09:10Z
dc.date.issued2021-09-07en_US
dc.description.abstractData streams are large sequences of data, possibly endless and temporarily ordered, that are common-place in Internet of Things (IoT) applications such as intrusion detection in computer networking, fraud detection in financial institutions, real-time tumor tracking in radiotherapy and social media analysis. Algorithms learning from such streams need to be able to construct near real-time models that continuously adapt to potential changes in patterns, in order to retain high performance throughout the stream. It follows that there are numerous challenges involved in supervised learning (or so-called classification) in such environments. One of the challenges in learning from streams is multi-class imbalance, in which the rates of instances in the different class labels differ substantially. Notably, classification algorithms may become biased towards the classes with more frequent instances, sacrificing the performance of the less frequent or so-called minority classes. Further, minority instances often arrive infrequently and in bursts, making accurate model construction problematic. For example, network intrusion detection systems must be able to distinguish between normal traffic and multiple minority classes corresponding to a variety of different types of attacks. Further, having labels for all instances are often infeasible, since we might have missing or late-arriving labels. For instance, when learning from a stream regarding the task of detecting network intrusions, the true label for all instances might not be available, or it might take time until the label is made available, especially for new types of attacks. In this thesis, we contribute to the advancements of online learning from evolving streams by focusing on the above-mentioned areas of multi-class imbalance and missing labels. First, we introduce a multi-class online ensemble algorithm designed to maintain a balanced performance over all classes. Specifically, our approach samples instances with replacement while dynamically increasing the weights of under-represented classes, in order to produce models that benefit all classes. Our experimental results show that our online ensemble method performs well against multi-class imbalanced data in various datasets. We further continue our study by introducing an approach to dealing with missing labels that utilize both labelled and unlabelled data to increase a model’s performance. That is, our method utilizes labelled data for pseudo-labelling unlabelled instances, allowing the model to perform better in environments where labels are scarce. More specifically, our approach features a meta-reinforcement learning agent, trained on multiple-source streams, that can effectively select the prediction of a K nearest neighbours (K-NN) classifier as the label for unlabelled instances. Extensive experiments on benchmark datasets demonstrate the value and effectiveness of our approach and confirm that our method outperforms state-of-the-art.en_US
dc.identifier.urihttp://hdl.handle.net/10393/42636
dc.identifier.urihttp://dx.doi.org/10.20381/ruor-26856
dc.language.isoenen_US
dc.publisherUniversité d'Ottawa / University of Ottawaen_US
dc.rightsAttribution-NoDerivatives 4.0 International*
dc.rights.urihttp://creativecommons.org/licenses/by-nd/4.0/*
dc.subjectMachine learningen_US
dc.subjectData streamsen_US
dc.subjectImbalanced learningen_US
dc.subjectSemi-supervised learningen_US
dc.subjectMeta-learningen_US
dc.titleLearning in the Presence of Skew and Missing Labels Through Online Ensembles and Meta-reinforcement Learningen_US
dc.typeThesisen_US
thesis.degree.disciplineGénie / Engineeringen_US
thesis.degree.levelMastersen_US
thesis.degree.nameMCSen_US
uottawa.departmentScience informatique et génie électrique / Electrical Engineering and Computer Scienceen_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail ImageThumbnail Image
Name:
Vafaie_Parsa_2021_thesis.pdf
Size:
2.44 MB
Format:
Adobe Portable Document Format
Description:

License bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail ImageThumbnail Image
Name:
license.txt
Size:
6.65 KB
Format:
Item-specific license agreed upon to submission
Description: