Improving P2P Network Traffic Classification with ML multi-classifiers
Citation
Haitham A. Jamil, Bushra M. A, Ahmed Abdalla, Ban M. K, Sulaiman M. Nor, Muhammad N. Marsono. "Improving P2P Network Traffic Classification with ML multi-classifiers". International Journal of P2P Network Trends and Technology (IJPTT), V4(2):60-65, Mar - Apr 2014, ISSN: 2249-2615, www.ijpttjournal.org, Published by Seventh Sense Research Group.
Abstract
Machine learning (ML) techniques are a promising method for classifying Internet traffic. They can detect encrypted communication and previously unknown traffic. However, the generation of training examples, feature selection, and classifier design all have a significant impact on classification results. This paper proposes an approach based on multiple ML classifiers to provide a robust model for online P2P Internet traffic classification. Validation and analysis were carried out through experiments on traces captured at Universiti Teknologi Malaysia. The results show that generating the training model with ML for P2P classification yields high accuracy, a low false negative rate, and low classification time.
Keywords
P2P traffic, traffic classification, Machine Learning, Multi-classifiers.