Unmasking Misinformation: A Machine Learning Approach to Detecting Fake News
DOI:
https://doi.org/10.58445/rars.3224
Keywords:
AI, Artificial Intelligence, Misinformation, Fake News Detection
Abstract
Misinformation has increased in recent years, with misleading content shared as real news to deceive the public and manipulate opinion. The dissemination of misinformation, particularly in areas with global implications such as politics and health, can have severe consequences for society as a whole. For example, widespread misinformation surrounding recent US elections has been shown to deepen polarization and erode trust in both democratic institutions and the news media. During health crises such as the Ebola outbreak and the COVID-19 pandemic, misleading reports about vaccines and treatments spread unnecessary fear, created barriers for public health response teams, and resulted in many preventable deaths. Social media further amplifies fake news, making it difficult for fact-checking efforts to keep pace. To help distinguish misinformation from credible reporting, this paper applies machine learning techniques to detect fake news with greater accuracy. We analyze datasets containing both fake and real news articles to uncover linguistic patterns and differences between the two. Natural Language Processing (NLP) techniques such as Term Frequency-Inverse Document Frequency (TF-IDF) are used to convert text data into numerical features for training machine learning models. Several classification algorithms, including Logistic Regression, Random Forest, and XGBoost, are then trained to differentiate fake from real news. To further explore how the two classes differ, we also compare the sentiment of fake and real articles. By drawing on data from everyday news, politics, and health sources, we keep the work grounded in the real-world implications of fake news disguised as fact. The goal is to develop an AI-powered automated fact-checking system that distinguishes between real and fake sources, thereby contributing to ongoing efforts to protect the public from the harms of misinformation and uphold trust in the news media.
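The pipeline described in the abstract (TF-IDF features fed to a classifier) can be illustrated with a minimal, standard-library-only Python sketch. It is not the paper's implementation: the four headlines are invented placeholders, the function names are hypothetical, and the learning rate and epoch count are arbitrary assumptions. Only Logistic Regression is shown; Random Forest and XGBoost would normally come from a library and are omitted here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(documents):
    """Compute a smoothed inverse document frequency for every corpus term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    # Smoothed IDF, ln((1 + N) / (1 + df)) + 1, so no seen term gets zero weight.
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(document, idf, vocabulary):
    """Map one document to an L2-normalised TF-IDF vector over a fixed vocabulary."""
    counts = Counter(tokenize(document))
    raw = [counts[term] * idf[term] for term in vocabulary]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid division by zero
    return [v / norm for v in raw]


def predict_proba(x, weights, bias):
    """Return the modelled probability that feature vector x is fake (label 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = predict_proba(x, weights, bias) - y  # gradient of log-loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Invented placeholder headlines (assumption): 1 = fake, 0 = real.
texts = [
    "shocking miracle cure doctors hate revealed",
    "secret plot exposed shocking truth they hide",
    "city council approves budget for road repairs",
    "health agency publishes vaccine trial results",
]
labels = [1, 1, 0, 0]

idf = fit_idf(texts)
vocabulary = sorted(idf)
features = [tfidf_vector(t, idf, vocabulary) for t in texts]
weights, bias = train_logistic_regression(features, labels)

# Score an unseen headline; words outside the training vocabulary are ignored.
new_headline = "shocking secret cure revealed"
fake_probability = predict_proba(tfidf_vector(new_headline, idf, vocabulary), weights, bias)
print(f"P(fake) for {new_headline!r}: {fake_probability:.2f}")
```

A toy corpus this small only demonstrates the mechanics. On real datasets the same steps would be evaluated on a held-out test split, since accuracy on the training articles says little about how the model handles unseen news.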
License
Copyright (c) 2025 Maya Hussain

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.