A Sequence to Image Transformation Technique for Anomaly Detection in Drifting Data Streams: Detection, Segmentation and Generative Models on Multiple Objects

dc.contributor.author: Ryan, Sid
dc.contributor.supervisor: Japkowicz, Nathalie
dc.contributor.supervisor: Kiringa, Iluju
dc.date.accessioned: 2021-02-17T20:39:16Z
dc.date.available: 2021-02-17T20:39:16Z
dc.date.issued: 2021-02-17 (en_US)
dc.description.abstract: In many real-world applications, the characteristics of data change over time. This behavior is known as concept drift. Maintaining optimal algorithms and their hyperparameters in such applications becomes cumbersome, as models become outdated very quickly. Although the data often consist of one-dimensional streams (e.g. collected by activity logs, sensors and mobile devices), at a higher level the aggregated sources produce multiple streams. Machine learning therefore requires univariate and multivariate analysis of long-term dependencies to create valuable insights. In this thesis, we assess hundreds of combinations of data characteristics and methods in sequential data. In particular, we use real-life anomalous instances from the network traffic domain and, to increase complexity, combine them with synthesized drifting data. Our preliminary evaluation of conventional machine learning, meta-learning and deep learning methods, which compares their generalization performance in the presence of concept drift, shows that deep learning outperforms all other tested methods. Although one-dimensional Convolutional Neural Networks (1D-CNN) produced the highest performance in image classification, like other models they can only label whether a sliding window is anomalous or not. However, in the majority of real-life applications, it is crucial to find the individual instances that resulted in an anomalous pattern. Therefore, we introduce a method that transforms the representation of the data into tensors of two-dimensional images, enabling modern deep learning methods to become directly applicable to sequential data. We propose the Sequential Mask Convolutional Neural Network (SMCNN), which pinpoints the location of anomalous patterns. The SMCNN model transforms sequential data by means of a specialized filter that produces flexible shape forms and detects multiple types of outliers simultaneously.
In addition, to address the high ratio of false positives of unsupervised Generative Adversarial Networks (GAN) under concept drift, we introduce a method for finding optimal sliding windows that automatically removes normal repetitive patterns. We introduce the DriftGAN architecture, which discriminates between normal and anomalous patterns. Our SMCNN and DriftGAN methods significantly outperform prior endeavours and provide high generalization capabilities on a wide array of one-dimensional data characteristics with a repetitive nature. (en_US)
dc.identifier.uri: http://hdl.handle.net/10393/41788
dc.identifier.uri: http://dx.doi.org/10.20381/ruor-26010
dc.language.iso: en (en_US)
dc.publisher: Université d'Ottawa / University of Ottawa (en_US)
dc.subject: Machine Learning (en_US)
dc.subject: Deep Learning (en_US)
dc.subject: Convolutional Neural Networks (en_US)
dc.subject: Generative Adversarial Networks (en_US)
dc.subject: Anomaly Detection (en_US)
dc.subject: Image Transformation (en_US)
dc.subject: Concept Drift (en_US)
dc.subject: Data Stream (en_US)
dc.title: A Sequence to Image Transformation Technique for Anomaly Detection in Drifting Data Streams: Detection, Segmentation and Generative Models on Multiple Objects (en_US)
dc.type: Thesis (en_US)
thesis.degree.discipline: Génie / Engineering (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.name: PhD (en_US)
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science (en_US)

Files

Original bundle

Name: Ryan_Sid_2021_thesis.pdf
Size: 18.8 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission