Preprint / Version 1

Machine Learning Models Accurately Predict Stock Market Crashes Using Macroeconomic Indicators

Authors

  • Aarav Pulsani, Texas A&M

DOI:

https://doi.org/10.58445/rars.3810

Keywords:

Machine Learning, Stock Market Crash, Macroeconomic Indicators, Random Forest, Yield Curve Spread, CBOE VIX, Shiller CAPE Ratio, Financial Forecasting, Bear Market Prediction, S&P 500, Classification, Feature Importance

Abstract

Stock market crashes have serious consequences for individuals, businesses, and national economies, so the ability to predict them in advance would be of considerable value to investors and policymakers. This study uses machine learning algorithms to assess whether monthly macroeconomic indicators can predict U.S. stock market crashes at a six-month forward horizon. A crash was defined as a decline of 20% or more in the S&P 500 index from its most recent peak, consistent with the conventional definition of a bear market. Monthly data from January 1950 to December 2023 were retrieved from the Federal Reserve Economic Data (FRED) database and other publicly available sources. Ten macroeconomic features were used as model inputs, including the yield curve spread, the unemployment rate, the CBOE Volatility Index (VIX), and the Shiller CAPE ratio. Several machine learning algorithms were used, including logistic regression, decision trees, random forest, support vector machines (SVM), and a multi-layer perceptron (MLP). All models were tuned using grid search with cross-validation. After tuning, the random forest classifier was particularly accurate, achieving an area under the receiver operating characteristic curve (AUC) of 0.88. Feature importance analysis identified the yield curve spread and the VIX as the most predictive features across all models.
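
The crash definition and six-month forward horizon described in the abstract can be illustrated with a minimal sketch. It is not the author's code: the function names `crash_flags` and `forward_labels`, the reading of "most recent peak" as the running maximum of monthly closes, and the toy price series are all assumptions made for illustration.

```python
from typing import List


def crash_flags(prices: List[float], threshold: float = 0.20) -> List[bool]:
    """Flag months whose close is at least `threshold` below the running peak.

    Assumption: "most recent peak" is taken to be the highest close so far.
    """
    flags = []
    peak = float("-inf")
    for price in prices:
        peak = max(peak, price)
        # A 20% decline means the close is at or below 80% of the peak.
        flags.append(price <= peak * (1.0 - threshold))
    return flags


def forward_labels(flags: List[bool], horizon: int = 6) -> List[int]:
    """Label month t as 1 if any of the next `horizon` months is flagged.

    The last `horizon` months are dropped because their future is unknown.
    """
    labels = []
    for t in range(len(flags) - horizon):
        window = flags[t + 1 : t + 1 + horizon]
        labels.append(int(any(window)))
    return labels


# Hypothetical monthly closes, not real S&P 500 data.
toy_prices = [100, 110, 105, 99, 87, 92, 111, 120, 118, 119, 121, 95]
toy_flags = crash_flags(toy_prices)
toy_labels = forward_labels(toy_flags, horizon=6)
```

Labels built this way would serve as the classification target, with the ten macroeconomic features observed at month t as inputs, so that no information from the forecast window leaks into the predictors.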

Author Biography

Aarav Pulsani, Texas A&M

Aarav Pulsani is an undergraduate student at Texas A&M University studying computer science with a focus on machine learning. His research interests center on applying machine learning methods to problems in finance and economics, particularly the use of macroeconomic indicators to forecast market behavior. This study was conducted independently as part of his ongoing exploration of how data-driven approaches can complement traditional methods in financial prediction.

Posted

2026-05-10