Distinguishing Human-Generated and AI-Generated Academic Writing: A Machine Learning Benchmark Study
DOI: https://doi.org/10.21015/vtse.v14i1.2274

Abstract
The rapid adoption of large language models (LLMs) such as ChatGPT has raised critical questions about authorship, originality, and integrity in academic writing. Unlike text caught by conventional plagiarism detection tools, AI-generated or AI-rephrased text can preserve the original meaning and context while modifying the writing style, making it difficult to detect with standard similarity checks. This study addresses the challenge by creating a domain-specific corpus of postgraduate-level academic texts. The corpus contains 22,520 samples, equally divided between human-written and AI-rephrased text. All samples were preprocessed and represented using two common techniques: TF-IDF and Word2Vec. The dataset was evaluated with well-known machine learning and deep learning models, including Logistic Regression, Support Vector Machines, Recurrent Neural Networks, and the transformer-based models BERT and T5. The results show that linear and sequential models provide low baseline performance, with accuracy between 50% and 54%, whereas BERT significantly outperforms the other models, achieving 83% precision along with high recall. Confusion matrix analysis further shows that traditional models tend to overpredict AI authorship, whereas BERT distinguishes reliably between human-written and AI-generated text. These results indicate that transformer-based models are more effective for authorship verification in academic settings, and they highlight the trade-offs among interpretability, computational cost, and predictive performance. Overall, the study offers recommendations for building credible, transparent, and domain-sensitive AI detectors for academia.
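The linear baseline described above can be sketched as follows. This is an illustrative example only, not the authors' actual pipeline: the two-sentence "corpus" is a hypothetical stand-in for the 22,520-sample dataset, and the abstract reports that real-world accuracy for this class of model was only 50-54%.

```python
# Sketch of a TF-IDF + Logistic Regression baseline of the kind the
# study benchmarks. Texts and labels here are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "The results indicate a statistically significant effect.",   # human-written (label 0)
    "The findings demonstrate a notable and meaningful impact.",  # AI-rephrased (label 1)
] * 10  # repeated so the toy classifier has enough samples to fit
labels = [0, 1] * 10

# Vectorize with TF-IDF, then fit a linear classifier on the sparse features.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

pred = clf.predict(["The results indicate a statistically significant effect."])
print(pred[0])  # 0 (classified as human-written)
```

On a realistic corpus this pipeline would be evaluated with a held-out test split and a confusion matrix rather than predictions on training sentences.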
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
This work is licensed under a Creative Commons Attribution License CC BY