Exploring Proximal Policy Optimization in ViZDoom: Training Agents for Complex Tasks with Hyperparameter Optimization
DOI: https://doi.org/10.21015/vtse.v14i1.2348

Abstract
ViZDoom, a Reinforcement Learning (RL) research platform built on the classic first-person shooter Doom, allows training and evaluating RL agents across scenarios of varying complexity. This study trains Deep Reinforcement Learning (DRL) agents with the Proximal Policy Optimization (PPO) algorithm, implemented through Stable-Baselines3 (SB3), across four ViZDoom scenarios of increasing complexity: Basic, Defend the Center, Health Gathering, and Deadly Corridor. We analyze the impact of hyperparameter tuning on PPO in these scenarios using the Optuna framework. Given the complexity of the Deadly Corridor scenario, advanced RL techniques such as reward shaping and curriculum learning are employed to achieve its objective. Agents are evaluated on mean episodic reward and mean episode length in each scenario, and PPO agents trained with default hyperparameters are compared against agents trained with optimized hyperparameters. The findings show that hyperparameter optimization has a moderate impact in simple environments, yielding 3.9% and 12.4% increases in mean episodic reward in the Basic and Defend the Center scenarios, respectively, but substantial gains in complex ones, with 523.7% and 203.7% improvements in the Health Gathering and Deadly Corridor (skill level 5) scenarios, respectively. These results offer insight into how DRL agents can be trained with PPO in complex environments featuring multiple challenging tasks through appropriate hyperparameter optimization.
License
This work is licensed under a Creative Commons Attribution (CC BY) License.