Preprint / Version 1

Early Detection of Drug Toxicity using Neural Networks

Authors

  • Ishana Saroha Massachusetts Academy of Math and Science

DOI:

https://doi.org/10.58445/rars.2677

Keywords:

Drug toxicity, Artificial Intelligence (AI), Machine Learning, QSAR, Graph Neural Networks (GNNs), Support Vector Machines (SVMs), Tox21

Abstract

Toxicity prediction is a critical step in drug development because it evaluates the safety of drug candidates. An abundance of resources goes into the development of new drugs, yet only 12% of all drugs are considered by the FDA, since many potential drugs are toxic. By the time toxicity is identified in a conventionally developed drug, anywhere from $1 billion to $2 billion could have been invested. Early detection of drug toxicity is therefore imperative. AI can be used to predict toxicity through QSAR and machine learning (ML) methods. Some ML-based solutions (DNN, SVM, RF) have been proposed that use only molecular features of the compounds, not molecular structure. The goal of this project was to create a GNN model, which uses both atomic features and molecular structure information, that could make better toxicity predictions. The Tox21 toxicity dataset was used for all training and evaluation. Three GNN models were created and then compared to an SVM model. Statistical analysis of the results showed that the GNN models performed better than the SVM model, with better F1-scores (3.34%-6.44% improvement) and MCC values (5.48%-9.40% improvement). These results show that GNN-based models predict toxicity better than SVM-based models. GNNs have great potential to augment existing QSAR methods used to predict toxicity in the drug development industry, thus reducing the cost, time, and resources invested and alleviating the ethical concerns of animal and clinical trials associated with conventional drug development. This model can help with molecule toxicity prediction in the pharmaceutical industry and other industries.
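
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F1-score and a Matthews correlation coefficient (MCC) can be computed from binary confusion-matrix counts. It is an illustration only, not the evaluation code used in this project; the function names and the example counts are hypothetical.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1-score: harmonic mean of precision and recall, from confusion counts."""
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when no positives exist or are predicted.
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced toxic / non-toxic test set.
    tp, tn, fp, fn = 40, 900, 30, 30
    print(f"F1  = {f1_score(tp, fp, fn):.4f}")
    print(f"MCC = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike F1, MCC also accounts for true negatives, which is one reason both are commonly reported together on imbalanced data.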

Posted

2025-06-29