Explainable BERT Models for Rumor Detection on Social Media
DOI: https://doi.org/10.21015/vtse.v14i1.2275

Abstract
The rapid spread of unverified information and rumors across social media platforms poses serious risks to public health, economic stability, and societal trust. In response to this growing challenge, researchers have developed machine learning and deep learning models for automated rumor detection. However, most existing systems operate as uninterpretable black boxes, making it difficult for users to trust them or understand how they reach their decisions. This study addresses these limitations by developing an explainable framework for rumor detection built on transformer-based models and advanced Large Language Models (LLMs). It evaluates and compares the performance of fine-tuned BERT-based models (BERT, RoBERTa, DistilBERT, and ALBERT) against cutting-edge LLMs (LLaMA3, DeepSeek R1, and Mistral). The models are applied to the PHEME dataset, a corpus of actual Twitter posts labeled as rumors, unverified statements, or non-rumors. Data preprocessing includes cleaning tweet text and extracting engagement metrics and user features. BERT performs best among the fine-tuned models, achieving the highest accuracy (82.9%), while the RAG-based LLMs proved less effective in zero-shot evaluation; this discrepancy reflects differences in training paradigms rather than inherent architectural superiority. Explainability is achieved using Local Interpretable Model-Agnostic Explanations (LIME), which visualizes the features most influential to each prediction. The findings highlight a trade-off between LLM flexibility and transformer precision, offering a scalable, interpretable solution for trustworthy rumor detection and content moderation.
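The preprocessing step mentioned above (cleaning tweet text and extracting engagement and user features) can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the cleaning rules and the field names (`retweet_count`, `favorite_count`, `followers_count`, `verified`) are assumptions modeled on the standard Twitter API payload.

```python
import re

def clean_tweet(text: str) -> str:
    """Normalize a raw tweet: strip URLs and mentions, collapse whitespace.

    A hypothetical sketch of the text cleaning described in the abstract;
    the exact rules used in the study are not specified there.
    """
    text = re.sub(r"https?://\S+", "", text)   # remove URLs
    text = re.sub(r"@\w+", "", text)           # remove user mentions
    text = re.sub(r"#", "", text)              # keep hashtag words, drop '#'
    return re.sub(r"\s+", " ", text).strip().lower()

def engagement_features(tweet: dict) -> dict:
    """Extract simple engagement and user features (assumed field names)."""
    user = tweet.get("user", {})
    return {
        "retweets": tweet.get("retweet_count", 0),
        "favorites": tweet.get("favorite_count", 0),
        "followers": user.get("followers_count", 0),
        "verified": int(user.get("verified", False)),
    }

if __name__ == "__main__":
    raw = "BREAKING: gunman reportedly seen downtown https://t.co/x @news"
    print(clean_tweet(raw))  # cleaned, lowercased text without URL/mention
```

The cleaned text would feed the transformer tokenizer, while the numeric engagement features could be concatenated with the model's pooled output or used by a downstream classifier.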
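The LIME explanations mentioned in the abstract work by perturbing an input, querying the classifier on each perturbation, and fitting a simple local surrogate to see which tokens drive the prediction. The from-scratch sketch below approximates that idea with a difference-in-means surrogate and a toy keyword-based scorer standing in for the fine-tuned BERT model; both the scorer and its trigger words are invented for illustration, and the real study uses the `lime` library against an actual model.

```python
import random

def toy_rumor_score(tokens):
    """Stand-in classifier: scores rumor likelihood from trigger words.

    Purely hypothetical; in the paper a fine-tuned BERT model plays this role.
    """
    triggers = {"breaking": 0.4, "unconfirmed": 0.3, "reportedly": 0.2}
    return min(1.0, 0.1 + sum(triggers.get(t, 0.0) for t in tokens))

def lime_style_weights(tokens, predict, n_samples=500, seed=0):
    """Approximate LIME: mask random token subsets and measure how each
    token's presence shifts the prediction (difference in mean scores)."""
    rng = random.Random(seed)
    sums = {0: [0.0] * len(tokens), 1: [0.0] * len(tokens)}
    counts = {0: [0] * len(tokens), 1: [0] * len(tokens)}
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in tokens]
        score = predict([t for t, keep in zip(tokens, mask) if keep])
        for i, keep in enumerate(mask):
            sums[int(keep)][i] += score
            counts[int(keep)][i] += 1
    # weight = mean score with token present minus mean score with it absent
    return {
        tok: sums[1][i] / max(counts[1][i], 1) - sums[0][i] / max(counts[0][i], 1)
        for i, tok in enumerate(tokens)
    }

tokens = "breaking gunman reportedly seen downtown".split()
weights = lime_style_weights(tokens, toy_rumor_score)
print(max(weights, key=weights.get))  # token with the strongest positive influence
```

With the toy scorer, the sensational trigger word receives the largest weight and neutral words stay near zero, which is exactly the per-token influence picture that LIME visualizes for the real models.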
License
This work is licensed under a Creative Commons Attribution License CC BY