AI vs. Human Programmers: Complexity and Performance in Code Generation

Samina Azeem; Muhammad Shumail Naveed; Muhammad Sajid; Imran Ali

doi:10.21015/vtcs.v13i1.2043

Authors

Samina Azeem, Department of Computer Science, Sardar Bahadur Khan Women University, Pakistan, https://orcid.org/0009-0006-8231-5146
Muhammad Shumail Naveed, Department of Computer Science & Information Technology, University of Balochistan, Quetta, Pakistan, https://orcid.org/0000-0003-3334-0848
Muhammad Sajid, Department of Computer Science & Information Technology, University of Balochistan, Quetta, Pakistan, https://orcid.org/0000-0002-8017-8129
Imran Ali, Department of Computer Science & Information Technology, University of Balochistan, Quetta, Pakistan, https://orcid.org/0000-0003-4149-3975

Abstract

Large language models such as ChatGPT can perform a wide variety of tasks across many fields, and this has greatly increased efficiency. However, their growing use raises concern about potential job displacement, particularly in technical fields. Although many studies have examined the performance of large language models in technical fields, assessments of their performance in programming remain notably scarce. This study addresses that gap by comparing ChatGPT (GPT-4) with human experts in coding to determine whether ChatGPT has advanced to the point where it can replace human programmers. To this end, 300 Python programs were generated with ChatGPT (GPT-4) and compared with functionally equivalent programs written by three experienced human programmers. The evaluation combined quantitative and qualitative assessment, using Halstead complexity measures, cyclomatic complexity, and expert judgment by two human evaluators. The results showed statistically significant differences between ChatGPT-generated and human-written code. ChatGPT-generated programs were more verbose, complex, and resource-demanding, as reflected in higher program volume, difficulty, and cyclomatic complexity scores. Qualitatively, ChatGPT's code was easier to read but lagged behind in key areas such as documentation quality, function structuring, and compliance with coding standards. Human-written programs, by contrast, performed well in maintainability, error handling, and handling of edge cases. Although ChatGPT was highly efficient at producing working code, its output needed substantial review and refinement to meet accepted standards. The study concludes that, while ChatGPT is a useful tool for generating code, it has not yet reached the level needed to replace human expertise in programming.
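
For readers unfamiliar with the metrics named in the abstract, the following is a minimal Python sketch of how cyclomatic complexity and the Halstead volume and difficulty measures can be approximated for a Python program. It is an illustration only and is not the tooling used in the study, which the abstract does not specify. The function names `cyclomatic_complexity` and `halstead`, the sample program, and the operator/operand classification rules (keywords and punctuation as operators; identifiers and literals as operands) are all assumptions made for this example.

```python
import ast
import io
import keyword
import math
import tokenize


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + number of decision points."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.IfExp,
                             ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and adds two branches.
            decisions += len(node.values) - 1
    return 1 + decisions


def halstead(source: str) -> dict:
    """Approximate Halstead volume, difficulty, and effort from tokens.

    Assumed classification: punctuation and keywords count as operators;
    identifiers, numbers, and strings count as operands.
    """
    operators, operands = [], []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            operators.append(tok.string)
        elif tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            operators.append(tok.string)
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            operands.append(tok.string)

    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total counts
    vocabulary, length = n1 + n2, N1 + N2

    volume = length * math.log2(vocabulary) if vocabulary else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 else 0.0
    return {"volume": volume, "difficulty": difficulty,
            "effort": volume * difficulty}


# Hypothetical sample program used only to exercise the two functions.
SAMPLE = """
def sign(x):
    if x > 0 and x != 0:
        return 1
    for _ in range(2):
        pass
    return -1
"""

if __name__ == "__main__":
    print("cyclomatic complexity:", cyclomatic_complexity(SAMPLE))
    print("halstead:", halstead(SAMPLE))
```

Under these assumptions, higher volume indicates a longer program with a larger vocabulary, higher difficulty indicates heavier reuse of operands across many distinct operators, and higher cyclomatic complexity indicates more independent paths through the code. Established tools may classify tokens differently and therefore report different absolute values.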
