A Survey of Feature Extraction and Feature Selection Techniques used in Machine Learning-Based Botnet Detection Schemes

Akinyemi Moruff Oyelakin, Jimoh Rasheed G


Machine learning techniques have been widely used to classify botnets because they are argued to offer advantages over signature-based approaches. The performance of some of these detection schemes has been traced to the relevance of the features used in the classification models. Extracting and selecting the most discriminative features for botnet classification is therefore an important research area. It has also been found that, when a machine learning-based approach is used to identify botnets, the chosen dataset has to be real and representative. Feature extraction and feature selection are necessary steps before a machine learning-based classification algorithm is applied to identify botnets. The purpose of the pre-processing and feature selection steps in a machine learning-based model is to remove irrelevant and redundant data from the experimental dataset, reduce computational complexity, and improve both model simplicity and accuracy. This paper surveys the feature extraction and feature selection methods used by researchers who have proposed machine learning-based botnet detection models. Its main purpose is to provide a better understanding of, and insight into, how improved botnet detection mechanisms can be achieved through enhanced feature extraction and selection methods.
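As a rough illustration of what removing irrelevant and redundant data can mean in practice, the sketch below applies a simple correlation-based filter to a handful of flow features. It is only one of many possible selection strategies and is not taken from any of the surveyed works; the feature names, the toy values, and both thresholds are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(columns, labels, min_relevance=0.3, max_redundancy=0.9):
    """Keep features that correlate with the label but not with each other.

    columns: dict mapping feature name -> list of numeric values
    labels:  list of 0/1 class labels (1 = botnet flow)
    The two thresholds are illustrative assumptions, not recommended values.
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    # Step 1: discard irrelevant features, rank the rest by relevance.
    ranked = sorted(
        (name for name in columns if relevance[name] >= min_relevance),
        key=lambda name: relevance[name],
        reverse=True,
    )
    # Step 2: greedily skip features that duplicate one already kept.
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_redundancy for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy flow records: four benign flows followed by four botnet flows.
flows = {
    "duration_s": [1, 2, 1, 2, 8, 9, 8, 9],
    "duration_ms": [1000, 2000, 1000, 2000, 8000, 9000, 8000, 9000],  # same signal, other unit
    "packets": [5, 9, 6, 8, 7, 12, 9, 11],
    "ttl": [64, 64, 64, 64, 64, 64, 64, 64],        # constant, hence irrelevant
    "port_parity": [0, 1, 0, 1, 0, 1, 0, 1],        # unrelated to the label
}
is_bot = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_features(flows, is_bot)
print(selected)
```

On this toy data the filter keeps one of the two duration columns together with `packets`, and drops the constant and label-independent columns. A filter of this kind only captures linear, pairwise relationships; wrapper and embedded methods trade more computation for sensitivity to feature interactions.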


DOI: http://dx.doi.org/10.21015/vtcs.v9i1.604



This work is licensed under a Creative Commons Attribution 3.0 License.