Emotion Detection and Analysis using Textual Data through Trainable and Pre-trained Word Embedding Methods
DOI: https://doi.org/10.21015/vtse.v13i2.2115
Abstract
Emotions play a significant role in human communication. People use them to convey their state of mind to one another on platforms such as X (formerly Twitter), Facebook, and other online social networks, often in free text, which has made emotion detection and analysis an active research area. This work aims to detect and analyze emotions in unstructured text. To that end, the study builds deep artificial neural network models that use trainable and pre-trained word embedding methods, then compares the models built with the different embeddings on accuracy, recall, precision, and F-measure. Experiments show that the deep artificial neural network with trainable word embedding outperformed all other models, achieving 67.36% accuracy, 53.27% recall, 82.62% precision, and 64.50% F-measure.
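The four metrics reported in the abstract can be computed from a model's predicted and true emotion labels. The following is a minimal sketch only: the abstract does not state how the scores were averaged across emotion classes, so macro-averaging is an assumption here, and the function name `macro_metrics` and the example labels are hypothetical rather than taken from the study.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F-measure.

    Macro-averaging (an unweighted mean over classes) is an assumption;
    the study's actual averaging scheme is not given in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gets a false positive
            fn[truth] += 1      # true class gets a false negative

    precisions, recalls, f_measures = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_measures) / n,
    }


# Hypothetical labels, for illustration only (not data from the study).
y_true = ["joy", "joy", "anger", "sadness", "sadness", "sadness"]
y_pred = ["joy", "anger", "anger", "sadness", "sadness", "joy"]
scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With macro-averaging, the F-measure is the mean of the per-class F-measures, so it need not equal the harmonic mean of the averaged precision and recall.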
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License (CC BY) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
This work is licensed under a Creative Commons Attribution License (CC BY).