Explainable BERT Models for Rumor Detection on Social Media
DOI: https://doi.org/10.21015/vtse.v14i1.2275

Abstract
The rapid spread of unverified information and rumors across social media platforms poses serious risks to public health, economic stability, and societal trust. In response to this growing challenge, researchers have developed machine learning and deep learning models for automated rumor detection. However, most existing systems operate as uninterpretable black boxes, making it difficult for users to trust them or understand how they reach their decisions. This study addresses these limitations by developing an explainable framework for rumor detection built on transformer-based models and advanced Large Language Models (LLMs). It evaluates and compares the performance of fine-tuned BERT-based models (BERT, RoBERTa, DistilBERT, and ALBERT) against cutting-edge LLMs (LLaMA3, DeepSeek R1, and Mistral). The models are applied to the PHEME dataset, a corpus of actual Twitter posts labeled as rumors, unverified statements, or non-rumors. Data preprocessing includes cleaning tweet text and extracting engagement metrics and user features. BERT performs best among the fine-tuned models, achieving the highest accuracy (82.9%), while the RAG-based LLMs proved less effective in zero-shot evaluation; this discrepancy reflects differences in training paradigms rather than inherent architectural superiority. Explainability is achieved using Local Interpretable Model-Agnostic Explanations (LIME), which visualizes the features most influential to each prediction. The findings highlight a trade-off between LLM flexibility and transformer precision, offering a scalable, interpretable solution for trustworthy rumor detection and content moderation.
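The preprocessing step mentioned above (cleaning tweet text and extracting engagement and user features) can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the cleaning rules and the field names (`retweet_count`, `favorite_count`, `followers_count`, `verified`) are assumptions modeled on the standard Twitter API payload.

```python
import re

def clean_tweet(text: str) -> str:
    """Normalize a raw tweet: strip URLs and mentions, collapse whitespace.

    A hypothetical sketch of the text cleaning described in the abstract;
    the exact rules used in the study are not specified there.
    """
    text = re.sub(r"https?://\S+", "", text)   # remove URLs
    text = re.sub(r"@\w+", "", text)           # remove user mentions
    text = re.sub(r"#", "", text)              # keep hashtag words, drop '#'
    return re.sub(r"\s+", " ", text).strip().lower()

def engagement_features(tweet: dict) -> dict:
    """Extract simple engagement and user features (assumed field names)."""
    user = tweet.get("user", {})
    return {
        "retweets": tweet.get("retweet_count", 0),
        "favorites": tweet.get("favorite_count", 0),
        "followers": user.get("followers_count", 0),
        "verified": int(user.get("verified", False)),
    }

if __name__ == "__main__":
    raw = "BREAKING: gunman reportedly seen downtown https://t.co/x @news"
    print(clean_tweet(raw))  # cleaned, lowercased text without URL/mention
```

The cleaned text would feed the transformer tokenizer, while the numeric engagement features could be concatenated with the model's pooled output or used by a downstream classifier.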
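The LIME explanations mentioned in the abstract work by perturbing an input, querying the classifier on each perturbation, and fitting a simple local surrogate to see which tokens drive the prediction. The from-scratch sketch below approximates that idea with a difference-in-means surrogate and a toy keyword-based scorer standing in for the fine-tuned BERT model; both the scorer and its trigger words are invented for illustration, and the real study uses the `lime` library against an actual model.

```python
import random

def toy_rumor_score(tokens):
    """Stand-in classifier: scores rumor likelihood from trigger words.

    Purely hypothetical; in the paper a fine-tuned BERT model plays this role.
    """
    triggers = {"breaking": 0.4, "unconfirmed": 0.3, "reportedly": 0.2}
    return min(1.0, 0.1 + sum(triggers.get(t, 0.0) for t in tokens))

def lime_style_weights(tokens, predict, n_samples=500, seed=0):
    """Approximate LIME: mask random token subsets and measure how each
    token's presence shifts the prediction (difference in mean scores)."""
    rng = random.Random(seed)
    sums = {0: [0.0] * len(tokens), 1: [0.0] * len(tokens)}
    counts = {0: [0] * len(tokens), 1: [0] * len(tokens)}
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in tokens]
        score = predict([t for t, keep in zip(tokens, mask) if keep])
        for i, keep in enumerate(mask):
            sums[int(keep)][i] += score
            counts[int(keep)][i] += 1
    # weight = mean score with token present minus mean score with it absent
    return {
        tok: sums[1][i] / max(counts[1][i], 1) - sums[0][i] / max(counts[0][i], 1)
        for i, tok in enumerate(tokens)
    }

tokens = "breaking gunman reportedly seen downtown".split()
weights = lime_style_weights(tokens, toy_rumor_score)
print(max(weights, key=weights.get))  # token with the strongest positive influence
```

With the toy scorer, the sensational trigger word receives the largest weight and neutral words stay near zero, which is exactly the per-token influence picture that LIME visualizes for the real models.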
License
This work is licensed under a Creative Commons Attribution License CC BY