Preprint / Version 1

Hyperparameter Optimization for Deep Reinforcement Learning

An Atari Breakout Case Study

Authors

  • Ken Zheng, John Burroughs School

DOI:

https://doi.org/10.58445/rars.1635

Keywords:

Reinforcement Learning, Hyperparameter Optimization, Computer Science, DQN, Atari Breakout

Abstract

In reinforcement learning (RL), a subfield of machine learning, we train systems to perform complex tasks through trial and error. An RL agent interacts with an environment, taking actions that generate a cumulative reward; sequences of actions that earn a high cumulative reward are favored in the future. Applications of RL include improving the performance of self-driving cars, improving the performance of large language models, and playing games like Go. While playing games might not have a direct impact on the real world, systems like AlphaGo have improved our understanding of RL in ways that carry over to more practical applications. This study uses game playing as a test bed to better understand the mechanisms underlying RL, specifically the effect of hyperparameter tuning on a model’s performance. A Deep Q-Network (DQN) architecture was chosen for this analysis, and we tuned the batch size, learning rate, exploration rate, and discount factor to maximize the score in the game of Atari Breakout. We hypothesized that optimizing these hyperparameters would increase the cumulative reward. We found that altering the discount factor to be greater than or less than one results in a much less effective model, whereas changes to the other hyperparameters had little effect on performance. The results of this study can be used to improve the performance of future RL models. All code to reproduce the results in this study is available at: https://github.com/BobyWoby/Reinforcement-Learning.
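
As a concrete illustration of the tuning setup described in the abstract, the Python sketch below searches the four hyperparameters with Optuna (cited in the references below). The search ranges and the train_and_evaluate stub are illustrative assumptions rather than the study's actual configuration; the real training code is in the linked repository.

    import optuna

    def train_and_evaluate(gamma, lr, batch_size, epsilon):
        # Stub so the sketch runs end to end; a real run would train a DQN
        # agent on Breakout with these hyperparameters and return its mean
        # evaluation score (the cumulative reward to be maximized).
        return 0.0

    def objective(trial):
        # Hypothetical search ranges, not the study's actual values.
        gamma = trial.suggest_float("gamma", 0.90, 0.999)        # discount factor
        lr = trial.suggest_float("lr", 1e-5, 1e-3, log=True)     # optimizer step size
        batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
        epsilon = trial.suggest_float("epsilon", 0.01, 0.30)     # exploration rate
        return train_and_evaluate(gamma, lr, batch_size, epsilon)

    study = optuna.create_study(direction="maximize")  # maximize cumulative reward
    study.optimize(objective, n_trials=50)

The upper bound on gamma is kept strictly below one in this sketch, since a discount factor at or above one lets the bootstrapped return grow without bound.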

References

Udousoro, I. C. “Machine Learning: A Review”. Semiconductor Science and Information Devices, vol. 2, no. 2, Oct. 2020, pp. 5-14, doi:10.30564/ssid.v2i2.1931.

Li, Yuxi. Deep Reinforcement Learning: An Overview. arXiv:1701.07274, arXiv, 25 Nov. 2018. arXiv.org, https://doi.org/10.48550/arXiv.1701.07274.

Kaelbling, L. P., et al. “Reinforcement Learning: A Survey.” Journal of Artificial Intelligence Research, vol. 4, May 1996, pp. 237–85. www.jair.org, https://doi.org/10.1613/jair.301.

Shakya, Ashish Kumar, et al. “Reinforcement Learning Algorithms: A Brief Survey.” Expert Systems with Applications, vol. 231, Nov. 2023, p. 120495. ScienceDirect, https://doi.org/10.1016/j.eswa.2023.120495.

Hüttenrauch, Maximilian, et al. “Deep Reinforcement Learning for Swarm Systems.” Journal of Machine Learning Research, vol. 20, 19 Feb. 2019, p. 1, https://doi.org/10.48550/arXiv.1807.06613.

Wang, Letian, et al. Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors. arXiv:2305.04412, arXiv, 7 May 2023. arXiv.org, https://doi.org/10.48550/arXiv.2305.04412.

Mnih, Volodymyr, et al. Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602, arXiv, 19 Dec. 2013. arXiv.org, https://doi.org/10.48550/arXiv.1312.5602.

Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty. PMC, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570626/. Accessed 10 Sept. 2024.

Kiran, Mariam, and Melis Ozyildirim. Hyperparameter Tuning for Deep Reinforcement Learning Applications. arXiv:2201.11182, arXiv, 26 Jan. 2022. arXiv.org, https://doi.org/10.48550/arXiv.2201.11182.

Yu, Tong, and Hong Zhu. Hyper-Parameter Optimization: A Review of Algorithms and Applications. arXiv:2003.05689, arXiv, 12 Mar. 2020. arXiv.org, https://doi.org/10.48550/arXiv.2003.05689.

Brockman, Greg, et al. OpenAI Gym. arXiv:1606.01540, arXiv, 5 June 2016. arXiv.org, https://doi.org/10.48550/arXiv.1606.01540.

Akiba, Takuya, et al. Optuna: A Next-Generation Hyperparameter Optimization Framework. arXiv:1907.10902, arXiv, 25 July 2019. arXiv.org, https://doi.org/10.48550/arXiv.1907.10902.

Leeney, William, and Ryan McConville. Uncertainty in GNN Learning Evaluations: The Importance of a Consistent Benchmark for Community Detection. arXiv:2305.06026, arXiv, 25 Nov. 2023. arXiv.org, https://doi.org/10.48550/arXiv.2305.06026.

Li, Yuxi. Reinforcement Learning Applications. arXiv:1908.06973, arXiv, 19 Aug. 2019. arXiv.org, https://doi.org/10.48550/arXiv.1908.06973.

Charpentier, Arthur, et al. “Reinforcement Learning in Economics and Finance.” Computational Economics, vol. 62, no. 1, June 2023, pp. 425–62. Springer Link, https://doi.org/10.1007/s10614-021-10119-4.

Folkers, Andreas, et al. “Controlling an Autonomous Vehicle with Deep Reinforcement Learning.” 2019 IEEE Intelligent Vehicles Symposium (IV), 2019, pp. 2025–31. IEEE Xplore, https://doi.org/10.1109/IVS.2019.8814124.

Du, Yuqing, et al. “Guiding Pretraining in Reinforcement Learning with Large Language Models.” Proceedings of the 40th International Conference on Machine Learning, PMLR, 2023, pp. 8657–77. proceedings.mlr.press, https://proceedings.mlr.press/v202/du23f.html.

Paszke, Adam, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv:1912.01703, arXiv, 3 Dec. 2019. arXiv.org, https://doi.org/10.48550/arXiv.1912.01703.

Kingma, Diederik P., and Jimmy Ba. Adam: A Method for Stochastic Optimization. arXiv:1412.6980, arXiv, 22 Dec. 2014. arXiv.org, https://arxiv.org/abs/1412.6980v9.

Girshick, Ross. Fast R-CNN. arXiv:1504.08083, arXiv, 30 Apr. 2015. arXiv.org, https://arxiv.org/abs/1504.08083v2.

Hu, Linwei, et al. Interpretable Machine Learning Based on Functional ANOVA Framework: Algorithms and Comparisons. arXiv:2305.15670, arXiv, 25 May 2023. arXiv.org, https://doi.org/10.48550/arXiv.2305.15670.

Posted

2024-09-19