Classification of peer-to-peer traffic using data mining techniques and IP layer attributes

Title: Classification of peer-to-peer traffic using data mining techniques and IP layer attributes
Authors: Hayajneh, Ahmad
Date: 2007
Abstract: Peer-to-Peer (P2P) is an internet application that allows a group of internet users to share their files and computing resources. P2P traffic has grown to an estimated 70% of broadband traffic, and its particular nature directly affects the telecom industry. Accordingly, telecom businesses have become very interested in finding solutions to identify and control P2P traffic. This research focuses on developing a practical method for classifying P2P traffic using data mining techniques and the information available in the TCP/IP header. We captured internet traffic, pre-processed and labeled it, and built several models using combinations of different attributes for record files of various sizes. We built the models using neural network and decision tree techniques. Successful models were then subjected to a more stressful test using different ratios of P2P to non-P2P traffic in the training data set. We observed that classification accuracy increases significantly when the source and destination IP addresses are taken into account. We concluded that source and destination IP addresses carry information about the "community of peers". Based on this observation, we recommend that the classifier be implemented within the administrative domain of the individual service provider's network and continuously updated, so that new communities of peers are detected while old communities of peers are not penalized after they stop using P2P applications. The proposed classification is based only on information in the IP layer, eliminating the privacy issues associated with deep packet inspection.
Collection: Thèses, 1910 - 2010 // Theses, 1910 - 2010
File: MR34074.PDF (5.61 MB, Adobe PDF)
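
The abstract reports that models built on IP-layer attributes became much more accurate once source and destination addresses were included. The sketch below illustrates one way such an effect can show up in a decision tree: it computes the information gain of each attribute, the split criterion used by ID3-style trees, and then builds a one-level tree on the best attribute. Everything in it is an assumption made for illustration: the records are hand-made toy data (not the thesis data set), the attribute names `src`, `proto`, and `length` are hypothetical, and the abstract does not say which tree algorithm or split criterion the author used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy flow records (NOT data from the thesis).
# Only IP-layer style attributes are used, plus a label for training.
RECORDS = [
    {"src": "10.0.0.1", "proto": "TCP", "length": "large", "label": "P2P"},
    {"src": "10.0.0.1", "proto": "UDP", "length": "small", "label": "P2P"},
    {"src": "10.0.0.2", "proto": "TCP", "length": "large", "label": "P2P"},
    {"src": "10.0.0.2", "proto": "TCP", "length": "small", "label": "P2P"},
    {"src": "10.0.0.3", "proto": "TCP", "length": "large", "label": "Non-P2P"},
    {"src": "10.0.0.3", "proto": "TCP", "length": "small", "label": "Non-P2P"},
    {"src": "10.0.0.4", "proto": "UDP", "length": "small", "label": "Non-P2P"},
    {"src": "10.0.0.4", "proto": "TCP", "length": "large", "label": "Non-P2P"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(records, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    base = entropy([r["label"] for r in records])
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r["label"])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(records, attributes):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(records, a))


def train_stump(records, attribute):
    """One-level decision tree: map each attribute value to its majority label."""
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r["label"])
    return {value: Counter(labels).most_common(1)[0][0]
            for value, labels in groups.items()}


def classify(stump, attribute, record, default="Non-P2P"):
    """Label a record; values never seen in training fall back to `default`."""
    return stump.get(record[attribute], default)


if __name__ == "__main__":
    attrs = ["src", "proto", "length"]
    for a in attrs:
        print(a, round(information_gain(RECORDS, a), 3))
    chosen = best_attribute(RECORDS, attrs)
    stump = train_stump(RECORDS, chosen)
    print(chosen, classify(stump, chosen, {"src": "10.0.0.2"}))
```

In this toy data the source address separates the two classes perfectly while the other attributes carry no information, which mirrors the "community of peers" observation only by construction. Raw addresses are also high-cardinality attributes that information gain tends to favor, and a host unseen in training falls back to the default label, which is consistent with the abstract's recommendation that the classifier be continuously updated.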