SafeCon: AI-Powered Real-Time Cyber Grooming Detection System
DOI: https://doi.org/10.21015/vtse.v13i2.2118
Abstract
The growth of online communication has been accompanied by a significant increase in harmful activity. Many people, especially children, are drawn into distressing experiences such as sexually explicit online conversations. Reports suggest that approximately one in four young people has encountered online harassment or inappropriate content, and cases of child exploitation through grooming and exposure to explicit content have surged. Using the PAN12 dataset, we generate text embeddings with the Universal Sentence Encoder (USE), reduce their dimensionality with Principal Component Analysis (PCA), and apply K-means clustering, with the number of clusters chosen by the Silhouette Score. This approach identifies sexually predatory conversations, enabling real-time moderation to protect users. We also evaluate the system on a manually labeled dataset to verify that harmful content is detected reliably.
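The clustering stage described in the abstract (PCA on sentence embeddings, then K-means with the cluster count chosen by the Silhouette Score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are synthetic 512-dimensional vectors standing in for USE output, and the function name `cluster_embeddings`, the number of PCA components, and the candidate range for k are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score


def cluster_embeddings(embeddings, n_components=10, k_values=range(2, 7), seed=0):
    """Reduce embeddings with PCA, then keep the K-means k with the best silhouette.

    n_components and k_values are illustrative assumptions, not values from the paper.
    Returns (best_k, best_score, best_labels).
    """
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(embeddings)

    best_k, best_score, best_labels = None, -1.0, None
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(reduced)
        score = silhouette_score(reduced, labels)  # in [-1, 1]; higher is better
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_score, best_labels


# Synthetic stand-in for USE output (assumption): three well-separated groups
# of 512-dimensional vectors, 60 vectors per group.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 512))
fake_embeddings = np.vstack([c + rng.normal(size=(60, 512)) for c in centers])

best_k, best_score, labels = cluster_embeddings(fake_embeddings)
print(f"selected k={best_k}, silhouette={best_score:.3f}")
```

In a real pipeline the synthetic array would be replaced by embeddings of chat messages, and the resulting clusters would still need to be compared against labeled conversations to decide which of them correspond to predatory content.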
License
This work is licensed under a Creative Commons Attribution License (CC BY).