Enhancing Interpretability in Anxiety Detection on Reddit: A Machine Learning Approach with LIME and Topic Modeling
DOI:
https://doi.org/10.21015/vtse.v13i2.2139
Abstract
Mental disorders, particularly anxiety, are an increasingly prevalent concern in modern society. People express their opinions and feelings on social media platforms such as Reddit, which makes these platforms a valuable source of information for understanding mental health. This study applies BERTopic and Local Interpretable Model-agnostic Explanations (LIME) to demonstrate how machine learning models for anxiety detection can be interpreted. To identify and analyze linguistic patterns, a novel dataset was collected from multiple subreddits covering anxiety and casual conversation. BERTopic was used for topic modeling to discover the key topics in these discussions. TF-IDF features were then used to train a Random Forest classifier, which achieved an accuracy of 88% in classifying posts as anxiety or non-anxiety. To make the model's decision-making process transparent, LIME was used to examine the textual features that influence its predictions. The study emphasizes the importance of explainability in AI-assisted mental health solutions and demonstrates the usefulness of social media data for analyzing how anxiety is articulated and how language is used differently.
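The classification and explanation steps named in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn is available, uses a handful of invented toy posts in place of the study's Reddit dataset, and replaces the LIME library with a simplified hand-written stand-in (random word removal plus a weighted linear surrogate). The `explain` helper, its parameters, and the hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical toy posts; the study's Reddit dataset is not reproduced here.
posts = [
    "my heart is racing and i feel panic before every exam",
    "i worry all night and cannot stop the panic",
    "constant worry and panic make it hard to breathe",
    "i feel anxious and worry about everything lately",
    "had a great pizza with friends this weekend",
    "what movie should i watch with friends tonight",
    "the weather is great for a walk this weekend",
    "just finished a great book any movie suggestions",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = anxiety, 0 = non-anxiety

# TF-IDF features feeding a Random Forest classifier (settings are assumptions).
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(posts, labels)


def explain(text, predict_proba, n_samples=500, seed=0):
    """Simplified LIME-style explanation for one post.

    Randomly drops words, queries the classifier on each variant, and fits a
    weighted linear surrogate whose coefficients approximate each word's
    local influence on the anxiety-class probability.
    """
    rng = np.random.default_rng(seed)
    words = text.split()
    # Binary masks: 1 keeps a word, 0 drops it. Row 0 is the unperturbed post.
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0, :] = 1
    variants = [
        " ".join(w for w, keep in zip(words, mask) if keep) for mask in masks
    ]
    probs = predict_proba(variants)[:, 1]
    # Variants closer to the original post (more words kept) weigh more.
    weights = masks.mean(axis=1)
    surrogate = Ridge(alpha=1.0).fit(masks, probs, sample_weight=weights)
    # Rank words by the magnitude of their surrogate coefficient.
    return sorted(
        zip(words, surrogate.coef_), key=lambda pair: abs(pair[1]), reverse=True
    )


query = "i feel panic and worry before meeting friends"
explanation = explain(query, model.predict_proba)
for word, weight in explanation:
    print(f"{word:>10s} {weight:+.3f}")
```

Positive weights mark words that push the prediction toward the anxiety class and negative weights mark words that push it away. The LIME method itself weights perturbed samples with a distance kernel and selects a limited number of features, so this sketch only conveys the general idea.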
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC-By) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
This work is licensed under a Creative Commons Attribution License CC BY