A Survey of Feature Extraction and Feature Selection Techniques used in Machine Learning-Based Botnet Detection Schemes
DOI: https://doi.org/10.21015/vtcs.v9i1.604
Abstract
Machine learning techniques have been widely used for the classification of botnets because they are argued to offer advantages over signature-based approaches. The performance of some of these detection schemes has been traced to the relevance of the features used in the classification models. The extraction and selection of the most discriminative features for botnet classification is therefore an important research area. It has also been found that when a machine learning-based approach is used to identify botnets, the chosen dataset has to be real and representative. Feature extraction and feature selection are necessary steps before a machine learning-based classification algorithm is applied to identify botnets. The pre-processing and feature selection steps in a machine learning-based model serve to remove irrelevant and redundant data from the experimental dataset, reduce computational complexity, and improve both model simplicity and accuracy. This paper provides a survey of the feature extraction and feature selection methods used by researchers who have proposed machine learning-based botnet detection models. The main purpose of the survey is to provide better understanding of, and insight into, how improved botnet detection mechanisms can be achieved through enhanced feature extraction and selection methods.
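As a minimal illustration of what removing irrelevant and redundant data can look like in practice, the sketch below applies a simple filter-style selection to a toy table of flow features. It is not a method taken from any of the surveyed papers: the feature names, the sample values, and both thresholds are assumptions chosen for illustration. A constant feature is treated as irrelevant because it cannot help separate classes, and a feature that is almost perfectly correlated with one already kept is treated as redundant.

```python
from math import sqrt


def variance(col):
    """Population variance of a list of numbers."""
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_features(columns, min_variance=1e-9, max_abs_corr=0.95):
    """Return the names of features that survive two filter steps.

    Step 1 drops (near-)constant features, which carry no information.
    Step 2 drops a feature if it is highly correlated with one already kept.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    kept = []
    for name, col in columns.items():
        if variance(col) <= min_variance:
            continue  # irrelevant: no variation across flows
        if any(abs(pearson(col, columns[k])) > max_abs_corr for k in kept):
            continue  # redundant: duplicates information in a kept feature
        kept.append(name)
    return kept


# Hypothetical flow-level features (names and values are made up).
flows = {
    "duration_s": [1.0, 2.0, 3.0, 4.0],
    "duration_ms": [1000.0, 2000.0, 3000.0, 4000.0],  # rescaled copy of duration_s
    "protocol_is_ip": [1.0, 1.0, 1.0, 1.0],           # constant
    "bytes_sent": [500.0, 120.0, 900.0, 310.0],
}

selected = select_features(flows)
print(selected)
```

This filter ignores class labels entirely; supervised selection methods additionally score each feature against the botnet/benign label.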
This work is licensed under a Creative Commons Attribution License CC BY