An NLP Approach to Predict and Suggest Next Word In Urdu Typing
DOI:
https://doi.org/10.21015/vtse.v12i4.2011
Abstract
Fast typing is essential for digitizing content in any language. Urdu, a prominent language of South Asia, is also being digitized, but the lack of available resources and the low speed of Urdu typing have hampered the process. At the same time, the high demand for Urdu content that needs to be digitized makes the work more expensive. In this research we examined various aspects of the Urdu language and identified several limitations that hinder high-speed typing in Urdu. The Urdu alphabet has 35+ letters, whereas international ISO-standard keyboards are built around the English alphabet of 25+ letters, a difference of about 10 letters. Typing these 10+ letters requires pressing and holding the SHIFT key, which wastes time and slows typing; we tried to solve this problem while keeping the keyboard standards as they are. This paper presents word prediction and suggestion for the Urdu Language (UL) based on a stochastic model: a Hidden Markov Model is used to predict the next word, a Unigram Model is used to suggest the current word and the upcoming word, and an N-Gram Model is followed with N=2. The main contribution of this paper is the use of POS tagging: each suggestion and prediction is also based on tagged words, using a dataset of thousands of tag combinations based on frequency of occurrence in the test data. A tool was developed to implement this concept for UL and was tested by regular and new Urdu content writers to measure improvements in their typing speed. In short, the tool lets users type less and choose more.
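The unigram and bigram (N=2) ideas mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: the class name `BigramSuggester`, its methods, and the romanized toy corpus are all assumptions made for illustration, and the Hidden Markov Model and POS-tag components described in the abstract are left out for brevity.

```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Toy frequency-based suggester (hypothetical; not the paper's implementation)."""

    def __init__(self):
        self.unigrams = Counter()            # word -> count
        self.bigrams = defaultdict(Counter)  # previous word -> Counter of next words

    def train(self, sentences):
        """Count unigrams and bigrams from whitespace-tokenized sentences."""
        for sentence in sentences:
            words = sentence.split()
            self.unigrams.update(words)
            for prev, nxt in zip(words, words[1:]):
                self.bigrams[prev][nxt] += 1

    @staticmethod
    def _top(counts, k):
        # Rank by descending count, breaking ties alphabetically.
        ranked = sorted(counts.items(), key=lambda wc: (-wc[1], wc[0]))
        return [word for word, _ in ranked[:k]]

    def complete(self, prefix, k=3):
        """Suggest completions of the word being typed, using unigram counts."""
        matches = {w: c for w, c in self.unigrams.items() if w.startswith(prefix)}
        return self._top(matches, k)

    def predict_next(self, prev_word, k=3):
        """Suggest the next word from bigram counts; fall back to unigrams."""
        followers = self.bigrams.get(prev_word)
        if not followers:
            return self._top(self.unigrams, k)
        return self._top(followers, k)


# Romanized Urdu toy corpus (an assumption; a real system would use Urdu script).
corpus = [
    "main school jata hoon",
    "main ghar jata hoon",
    "woh school jata hai",
]

model = BigramSuggester()
model.train(corpus)

print(model.complete("ja"))               # completions for a typed prefix
print(model.predict_next("jata", k=2))    # most frequent followers of "jata"
```

A POS-aware variant along the lines the abstract describes would additionally condition the ranking on the tag of the previous word, but the tag set and tag-combination data needed for that are not given here.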
License
This work is licensed under a Creative Commons Attribution License CC BY