A Sequence to Image Transformation Technique for Anomaly Detection in Drifting Data Streams: Detection, Segmentation and Generative Models on Multiple Objects

Ryan, Sid

dc.contributor.author	Ryan, Sid
dc.contributor.supervisor	Japkowicz, Nathalie
dc.contributor.supervisor	Kiringa, Iluju
dc.date.accessioned	2021-02-17T20:39:16Z
dc.date.available	2021-02-17T20:39:16Z
dc.date.issued	2021-02-17	en_US
dc.description.abstract	In many real-world applications, the characteristics of data change over time. This behavior is known as concept drift. Maintaining optimal algorithms and their hyperparameters in such applications becomes cumbersome, as models become outdated very quickly. Although the data often consist of one-dimensional streams (e.g. collected from activity logs, sensors and mobile devices), at a higher level the aggregated sources produce multiple streams. Machine learning therefore requires univariate and multivariate analysis of long-term dependencies to create valuable insights. In this thesis, we assess hundreds of combinations of data characteristics and methods on sequential data. In particular, we use real-life anomalous instances from the network traffic domain and, to increase complexity, combine them with synthesized drifting data. Our preliminary evaluation compares the generalization performance of conventional machine learning, meta-learning and deep learning methods in the presence of concept drift, and the results show that deep learning outperforms all other tested methods. Although one-dimensional Convolutional Neural Networks (1D-CNN) produced the highest performance in image classification, they, like the other models, are able to label whether sliding windows are anomalous or not. However, in the majority of real-life applications, it is crucial to find the individual instances that resulted in an anomalous pattern. Therefore, we introduce a method to transform the representation of the data into tensors of two-dimensional images, enabling modern deep learning methods to become directly applicable to sequential data. We propose the Sequential Mask Convolutional Neural Network (SMCNN), which pinpoints the location of anomalous patterns. The SMCNN model transforms sequential data by means of a specialized filter that produces flexible shape forms and detects multiple types of outliers simultaneously.
In addition, to address the high ratio of false positives of unsupervised Generative Adversarial Networks (GAN) under concept drift, we introduce a method for finding optimal sliding windows that automatically removes normal repetitive patterns. We introduce the DriftGAN architecture, which discriminates between normal and anomalous patterns. Our SMCNN and DriftGAN methods significantly outperform prior endeavours and provide high generalization capabilities on a wide array of one-dimensional data characteristics with a repetitive nature.	en_US
dc.identifier.uri	http://hdl.handle.net/10393/41788
dc.identifier.uri	http://dx.doi.org/10.20381/ruor-26010
dc.language.iso	en	en_US
dc.publisher	Université d'Ottawa / University of Ottawa	en_US
dc.subject	Machine Learning	en_US
dc.subject	Deep Learning	en_US
dc.subject	Convolutional Neural Networks	en_US
dc.subject	Generative Adversarial Networks	en_US
dc.subject	Anomaly Detection	en_US
dc.subject	Image Transformation	en_US
dc.subject	Concept Drift	en_US
dc.subject	Data Stream	en_US
dc.title	A Sequence to Image Transformation Technique for Anomaly Detection in Drifting Data Streams: Detection, Segmentation and Generative Models on Multiple Objects	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Génie / Engineering	en_US
thesis.degree.level	Doctoral	en_US
thesis.degree.name	PhD	en_US
uottawa.department	Science informatique et génie électrique / Electrical Engineering and Computer Science	en_US

Files

Original bundle

Name: Ryan_Sid_2021_thesis.pdf
Size: 18.8 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon to submission

Collections

- Thèses, 2011 - // Theses, 2011 -