Preprint / Version 1

Integrating Machine Learning with Plasma Lipidomics to Predict Alzheimer’s Disease Progression in Patients with Mild Cognitive Impairment

Authors

  • Luke Betlow

DOI:

https://doi.org/10.58445/rars.2869

Keywords:

Machine Learning, Lipidomics, Alzheimer's Disease

Abstract

Alzheimer’s disease (AD) remains one of the most challenging neurodegenerative disorders, largely because early diagnosis is difficult and progression is hard to predict in patients who already exhibit mild cognitive impairment (MCI). Recent advances in lipidomics and machine learning offer new avenues for uncovering biological markers that may predict disease development. This study explores whether machine learning models trained on plasma lipidomic data and select biomarkers can identify MCI patients at higher risk of progressing to AD. We used a publicly available dataset of 212 participants, focusing on the subgroup of 89 MCI patients, each labeled by whether they progressed to AD. Clinical metadata were dropped so that only lipidomic features and a derived Tau Ratio (CSF p-tau / total tau) were retained, and classifiers were trained to predict the binary progression outcome. Models evaluated included Random Forest, Logistic Regression, Support Vector Machine, Decision Tree, Naive Bayes, and a neural network. The best-performing models (Random Forest and Decision Tree) each achieved an accuracy of 0.7778, with balanced precision and recall. Feature importances derived from the Decision Tree model highlighted a set of lipidomic variables with high predictive contribution. These findings suggest that lipidomic profiles, particularly when enriched with biologically relevant ratios such as the Tau Ratio, can contribute meaningful signal to classification models. While exploratory, this work supports the utility of machine learning for neurodegenerative disease prediction and offers a reproducible pipeline for future studies aiming to integrate lipidomics into clinical screening tools for AD risk.
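The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the study's code: it assumes scikit-learn and NumPy, uses randomly generated stand-in data in place of the published dataset, and every name in it (`lipid_0` ... `lipid_19`, `tau_ratio`, the 80/20 split, the hyperparameters) is hypothetical. It shows only the general shape of the approach: derive the Tau Ratio, append it to the lipid features, fit two of the listed classifiers, and read feature importances from the Decision Tree.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's dataset): 89 MCI patients,
# 20 hypothetical lipid features, CSF tau measurements, binary labels.
n_patients, n_lipids = 89, 20
lipids = rng.normal(size=(n_patients, n_lipids))
p_tau = rng.uniform(20.0, 120.0, size=n_patients)
total_tau = rng.uniform(200.0, 900.0, size=n_patients)
progressed = rng.integers(0, 2, size=n_patients)  # 1 = progressed to AD

# Derived feature described in the abstract: CSF p-tau / total tau.
tau_ratio = p_tau / total_tau
X = np.column_stack([lipids, tau_ratio])
feature_names = [f"lipid_{i}" for i in range(n_lipids)] + ["tau_ratio"]

# Hold out a stratified test set (the 80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, progressed, test_size=0.2, stratify=progressed, random_state=0
)

# Two of the six model families named in the abstract.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Rank features by the Decision Tree's impurity-based importances.
importances = models["decision_tree"].feature_importances_
top_features = sorted(
    zip(feature_names, importances), key=lambda pair: pair[1], reverse=True
)[:5]

print(accuracies)
print(top_features)
```

Because the labels here are random, the printed accuracies should hover around chance and carry no biological meaning. With so few patients, a single held-out split is also noisy, so repeated or cross-validated evaluation would give a steadier estimate in practice.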

Posted

2025-08-03
