A Comparative Analysis of Regressor Machine Learning Models in Forecasting SPDR S&P 500 ETF Trust (SPY) Movements
DOI:
https://doi.org/10.58445/rars.3065
Keywords:
Finance, Computer Science, Machine Learning, Stock Market Prediction
Abstract
Stock market forecasting remains one of the most challenging tasks in finance due to the market’s high volatility, complex dynamics, and sensitivity to external events. This study investigates the short-term predictive performance of five regression models on the SPDR S&P 500 ETF Trust (SPY). The models are evaluated in two experimental setups: one that incorporates technical indicators and one that excludes them. The aim is to determine whether technical indicators improve prediction accuracy and to identify which model is most effective for short-term forecasting. Model performance is assessed using mean absolute error (MAE), root mean squared error (RMSE), the R² score, and directional accuracy. The results indicate that the R² score is a poor indicator of model quality on short-term financial data. With technical indicators, directional accuracy ranged from 51% to 54%, better than random guessing. Without technical indicators, it ranged from 45% to 48%, highlighting the importance of technical indicators in prediction.
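The four evaluation metrics named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `evaluate_forecast`, the argument `y_prev` (the last observed value before each forecast), and the definition of directional accuracy as the share of forecasts whose predicted move has the same sign as the realized move are all assumptions, since the abstract does not define them.

```python
import math


def evaluate_forecast(y_true, y_pred, y_prev):
    """Compute MAE, RMSE, R^2, and directional accuracy for one-step forecasts.

    y_prev is an assumed input: the last observed value before each forecast,
    so (y_true - y_prev) is the realized move and (y_pred - y_prev) the
    predicted move.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # R^2 compares the model's squared error with that of a constant
    # forecast equal to the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    # A hit requires predicted and realized moves to share a nonzero sign;
    # a flat realized or predicted move counts as a miss (an assumption).
    hits = sum(
        1
        for t, p, prev in zip(y_true, y_pred, y_prev)
        if (t - prev) * (p - prev) > 0
    )

    return {
        "mae": mae,
        "rmse": rmse,
        "r2": r2,
        "directional_accuracy": hits / n,
    }
```

Called with previous closes, realized closes, and predicted closes of equal length, the function returns all four scores in one dictionary. R² is measured against a constant mean forecast, while directional accuracy looks only at the sign of each move, so the two can rank the same set of forecasts differently.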
License
Copyright (c) 2025 Puranjay Haldankar, Seungwoo Lee

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.