Role of Logistic Regression in Malware Detection: A Systematic Literature Review

Authors

  • Muhammad Shoaib Farooq, University of Management and Technology, Lahore, Pakistan
  • Zeeshan Akram, University of Management and Technology, Lahore, Pakistan
  • Atif Alvi, University of Management and Technology, Lahore, Pakistan
  • Uzma Omer, University of Management and Technology, Lahore, Pakistan

DOI:

https://doi.org/10.21015/vtse.v10i2.963

Abstract

The need for computer security arose when Brain, the first known virus, was introduced into computer systems. Malware detection becomes even more vital when networks are used to transfer secret information. Today, central areas of daily life, such as banking, agriculture, robotics, virtual social life, online multiplayer gaming, and private conversations, rely on the internet, and malware can disrupt all of them if it is ignored. New malware is discovered continually, so a reliable, fast, and trustworthy machine learning technique is needed to handle it. The logistic regression classifier, the main focus of this paper, is suitable for handling such large volumes of data. This is a complete systematic literature review (SLR) that presents progressive approaches in the field of malware detection, and it helps reduce the time and cost for researchers. Limitations and future directions of machine learning classifiers for malware detection are also discussed.

Published

2022-05-15

How to Cite

Farooq, M. S., Akram, Z., Alvi, A., & Omer, U. (2022). Role of Logistic Regression in Malware Detection: A Systematic Literature Review. VFAST Transactions on Software Engineering, 10(2), 36–46. https://doi.org/10.21015/vtse.v10i2.963

Section

Articles