SafeCon: AI-Powered Real-Time Cyber Grooming Detection System

Authors

DOI:

https://doi.org/10.21015/vtse.v13i2.2118

Abstract

The growth of online communication has been accompanied by a significant increase in harmful activity. Many people, especially children, become victims of distressing experiences such as sexual conversations online. Reports suggest that approximately one in four young people have encountered online harassment or inappropriate content, and there has been a disturbing surge in cases of child exploitation through grooming and exposure to explicit content. To address this problem, we use the PAN12 dataset: we generate text embeddings with the Universal Sentence Encoder (USE), reduce their dimensionality with Principal Component Analysis (PCA), and apply K-means clustering, with the number of clusters chosen by the Silhouette Score. This approach identifies sexually predatory conversations, enabling real-time moderation to protect users. The system is also evaluated on a manually labeled dataset to assess how robustly it detects harmful content.
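
The clustering stage described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cluster_embeddings`, the choice of 10 principal components, and the candidate range of 2 to 6 clusters are all hypothetical, and synthetic 512-dimensional vectors stand in for real USE embeddings of PAN12 conversations so that the example runs without downloading a model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def cluster_embeddings(embeddings, n_components=10, k_range=range(2, 7), seed=0):
    """Reduce embeddings with PCA, then keep the K-means run whose
    silhouette score is highest over the candidate cluster counts.

    n_components and k_range are illustrative defaults (assumptions),
    not values reported in the paper.
    """
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(embeddings)
    best = None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(reduced)
        score = silhouette_score(reduced, labels)
        if best is None or score > best[1]:
            best = (k, score, labels)
    return best  # (number of clusters, silhouette score, cluster labels)


# Assumption: synthetic stand-in for USE output. Real usage would replace this
# with one 512-dimensional USE vector per message or conversation.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 512))
embeddings = np.vstack([c + rng.normal(size=(40, 512)) for c in centres])

best_k, best_score, labels = cluster_embeddings(embeddings)
print(f"selected k={best_k}, silhouette={best_score:.3f}")
```

The synthetic data contains three well-separated groups, so the silhouette criterion should select three clusters here. With real embeddings, the resulting clusters would still need to be inspected or compared against labeled conversations to decide which of them correspond to predatory content.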

Published

2025-05-04

How to Cite

Sallahudin, S., Muhammad Ismail, Ali, S., Ahmed, A., & Faizan Hameed, M. (2025). SafeCon: AI-Powered Real-Time Cyber Grooming Detection System. VFAST Transactions on Software Engineering, 13(2), 44–55. https://doi.org/10.21015/vtse.v13i2.2118

Issue

Section

Articles