A Fine Grained Sentiment Analysis of Arabic Language

DOI:

https://doi.org/10.21015/vtcs.v12i2.1926

Abstract

This work focuses on fine-grained sentiment analysis of Arabic text using recent Natural Language Processing methods. Arabic is a language rich in variation, spoken by over 400 million people, yet resources for Arabic sentiment analysis remain scarce. To address this gap, this study employs AraBERT, a model specifically fine-tuned for Arabic text. A corpus of one hundred thousand Arabic reviews across categories such as hotels, books, and movies was scraped and cleaned. These reviews were then categorized into positive, negative, and mixed sentiments. AraBERT was compared with traditional machine learning methods, including Logistic Regression, Decision Tree, Naïve Bayes, and Random Forest. AraBERT achieved the highest accuracy, 88%, along with higher precision, recall, and F1 scores than the other models for both the positive and negative sentiment classes. This work demonstrates that AraBERT effectively captures the syntactic and semantic structure of Arabic, making it a valuable tool for Arabic sentiment analysis across various applications. Future work will extend the model to handle neutral sentiments and include additional dialects to further improve its performance.
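
The metrics named in the abstract (accuracy, plus per-class precision, recall, and F1) can be computed as in the minimal sketch below. The function name `per_class_metrics`, the label strings, and the toy data are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (accuracy, {label: (precision, recall, f1)}) for two label lists.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    accuracy = sum(tp.values()) / len(y_true)

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return accuracy, scores


# Toy labels, purely illustrative (not data from the paper).
y_true = ["positive", "positive", "negative", "negative", "mixed", "mixed"]
y_pred = ["positive", "negative", "negative", "negative", "mixed", "positive"]

accuracy, scores = per_class_metrics(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}")
for label, (p, r, f1) in scores.items():
    print(f"{label:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Reporting the scores per class, rather than a single average, matches the abstract's comparison of the positive and negative classes separately.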

Published

2024-12-04

How to Cite

Tanveer, A., Khan, M., Sarwar, R., Aslam, N., & Fuzail, M. (2024). A Fine Grained Sentiment Analysis of Arabic Language. VAWKUM Transactions on Computer Sciences, 12(2), 178–190. https://doi.org/10.21015/vtcs.v12i2.1926