FedSecureNLP: an Advanced Federated Learning approach for privacy-preserving and secure Ecommerce NLP System

Fayyaz Ali; Qurban Ali; Farhan Bashir Shaikh; Raja Sohail Ahmed Larik; Muhammad Saad; Muhammad Maaz Akhter

doi:10.21015/vtcs.v14i1.2398

Authors

Fayyaz Ali, Department of Software Engineering, Sir Syed University of Engineering and Technology, Karachi, Sindh, Pakistan. ORCID: https://orcid.org/0000-0003-3232-5363
Qurban Ali, Department of Computer Science, Sindh Madressatul Islam University, Karachi, Sindh, Pakistan. ORCID: https://orcid.org/0009-0003-0087-9270
Farhan Bashir Shaikh, Department of Computer Science, The University of Larkano, Larkana, Sindh, Pakistan. ORCID: https://orcid.org/0000-0002-1221-1791
Raja Sohail Ahmed Larik, School of Computer Science, Hebei International Studies University, Hebei, P.R. China. ORCID: https://orcid.org/0000-0002-7841-6853
Muhammad Saad, Department of Computer Science, Sindh Madressatul Islam University, Karachi, Sindh, Pakistan. ORCID: https://orcid.org/0009-0004-3768-110X
Muhammad Maaz Akhter, Department of Computer Science, FAST National University of Computer and Emerging Sciences, Karachi, Sindh, Pakistan. ORCID: https://orcid.org/0009-0006-2425-2976

DOI:

https://doi.org/10.21015/vtcs.v14i1.2398

Abstract

This research presents an advanced Federated Learning (FL) approach for a privacy-preserving and secure e-commerce NLP system. The volume of information, and its importance, is growing rapidly. Businesses today handle huge volumes of data, such as emails, that are often central to their day-to-day operations. Extracting useful information from this data is difficult, and keeping it safe is a major concern. Data security is essential for preserving trust in an online store and for making customers feel safe when they make purchases. Hence, in this paper we develop FedSecureNLP, a privacy-preserving and secure system for text summarization, sentiment analysis, and question answering (QA). Within the FL configuration, local training for text summarization, sentiment prediction, and question answering is performed autonomously on each node's local dataset. Bidirectional Encoder Representations from Transformers (BERT) tokenization is used to create tokens and extract terms from Amazon review data, after which hierarchical deep learning for text (HDLTex) is used to predict the sentiment rating. A deep convolutional neural network (DeepCNN) then performs extractive summarization, and random multimodel deep learning (RMDL) is used for QA prediction. The QA module has separate training and testing steps. During the evaluation phase, the pretrained RMDL model processes the input query, analyzes it semantically, and generates a relevant response. FCDVO is designed to tune the hyperparameters of HDLTex, DeepCNN, and RMDL. The presented system achieved a precision of 92.896%, a recall of 92.481%, an F-measure of 92.688%, and a Root Mean Square Error (RMSE) of 0.088.
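
As a quick sanity check on the reported metrics, the following minimal Python sketch recomputes the F-measure as the harmonic mean of precision and recall. It assumes that 92.896% is the precision, 92.481% the recall, and 92.688% the F-measure; the abstract does not state which F-measure variant the authors used, so the balanced F1 definition here is an assumption. The function names and the toy rating lists in the RMSE example are hypothetical and are not taken from the paper.

```python
import math


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred) -> float:
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Values reported in the abstract (assumed mapping: precision, then recall).
precision = 0.92896
recall = 0.92481
f1 = f_measure(precision, recall)
print(f"F-measure: {f1 * 100:.3f}%")

# Hypothetical star ratings, used only to illustrate how RMSE is computed.
toy_true = [5, 4, 3, 5]
toy_pred = [5, 4, 3, 4]
toy_rmse = rmse(toy_true, toy_pred)
print(f"Toy RMSE: {toy_rmse}")
```

Under these assumptions the recomputed F-measure agrees with the reported 92.688% to three decimal places, so the three percentages are mutually consistent. The reported RMSE of 0.088 cannot be recomputed without the underlying predictions.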
