Emotion Detection and Analysis using Textual Data through Trainable and Pre-trained Word Embedding Methods

Authors

M. Alvi, A. Akhter, M. B. Alvi, N. Fatima

DOI: https://doi.org/10.21015/vtse.v13i2.2115

Abstract

Emotions play a significant role in human communication. People convey their state of mind to one another on online social networks such as X (formerly Twitter) and Facebook, often in free text, which has made emotion detection and analysis an active research area. This work aims to detect and analyze emotions in unstructured text. To that end, it builds deep artificial neural network models using trainable and pre-trained word embedding methods, then compares the models built with the different word embeddings on accuracy, recall, precision, and F-measure. In the experiments, the deep artificial neural network with trainable word embedding outperformed all other models, achieving 67.36% accuracy, 53.27% recall, 82.62% precision, and 64.50% F-measure.
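The four metrics reported above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the abstract does not state how per-class scores were averaged, so macro averaging is assumed here, and the function name `evaluate` and the emotion labels in the toy data are hypothetical.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F-measure.

    Macro averaging is an assumption; the abstract does not say which
    averaging scheme the study used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class receives a false positive
            fn[truth] += 1   # true class receives a false negative

    def safe_div(num, den):
        # Avoid ZeroDivisionError for classes that are never predicted or never occur.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f_scores = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f_measure": safe_div(sum(f_scores), n),
    }


# Toy data with hypothetical emotion labels (not from the paper's dataset).
y_true = ["joy", "joy", "anger", "sadness", "sadness", "anger"]
y_pred = ["joy", "anger", "anger", "sadness", "joy", "anger"]
scores = evaluate(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

As in the reported results, precision and recall can diverge noticeably for the same model, which is why the F-measure is reported alongside them.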

Published

2025-05-03

How to Cite

Alvi, M., Akhter, A., Alvi, M. B., & Fatima, N. (2025). Emotion Detection and Analysis using Textual Data through Trainable and Pre-trained Word Embedding Methods. VFAST Transactions on Software Engineering, 13(2), 28–43. https://doi.org/10.21015/vtse.v13i2.2115

Section

Articles