Detecting Depression in Social Media with NLP Models Trained on Journal Entry Data
DOI: https://doi.org/10.58445/rars.1632
Keywords: text classification, depression, tweets
Abstract
Writing has long been recognized as a powerful means of expressing human emotion, a reflective practice that lets individuals process and articulate their inner experiences. With the rise of social media, however, the landscape of emotional expression has shifted. The transition from private journaling to public posting raises important questions about how effectively these platforms serve as emotional outlets and what they reveal about users' mental health, particularly for pervasive mood disorders such as depression, which affects over 18 million adults in the United States. Natural language processing (NLP) models have recently been noted as a promising tool for detecting underlying sentiment in text. This research examines how the Twitter posts of individuals with depression are classified by an NLP model trained on journal data labeled by emotion. Two separate clustering approaches were used to reduce the dimensionality of the training data and train machine learning models: one with spectral clustering and principal component analysis (PCA), and the other with the Natural Language Toolkit (NLTK) library. Both machine learning approaches achieved accuracy over 99%, and both classified the tweets of depressed Twitter users as more negative than those of non-depressed users. These findings suggest that the emotional content of social media posts by individuals with depression is consistently more negative, in line with the patterns observed in their journal entries. Ultimately, this research highlights the evolving role of social media as a platform for emotional expression and its implications for mental health monitoring.
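The general workflow the abstract describes (train on emotion-labeled journal text, then compare how two groups of tweets are classified) can be illustrated with a minimal sketch. This is not the paper's pipeline: a plain bag-of-words Naive Bayes classifier is swapped in for the spectral clustering/PCA and NLTK models, and the sentences, labels, group names, and function names below are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Made-up sentences standing in for emotion-labeled journal entries
# (an assumption for illustration; not the paper's dataset).
JOURNAL = [
    ("i felt calm and grateful after a walk with friends", "positive"),
    ("today was a happy and productive day", "positive"),
    ("i am proud of the progress i made", "positive"),
    ("i feel empty and tired of everything", "negative"),
    ("nothing matters and i cannot sleep", "negative"),
    ("i felt alone and hopeless again today", "negative"),
]

def tokenize(text):
    # Lowercase whitespace tokenizer; a real pipeline would do more cleaning.
    return text.lower().split()

def train(examples):
    # Count words per label and documents per label.
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    # Multinomial Naive Bayes with add-one smoothing; unseen words are skipped.
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def negative_share(tweets, model):
    # Fraction of tweets in a group that the model labels "negative".
    labels = [classify(t, model) for t in tweets]
    return labels.count("negative") / len(labels)

model = train(JOURNAL)

# Hypothetical tweet groups (invented text, not real user data).
group_a = ["i feel hopeless and tired", "cannot sleep again", "happy day with friends"]
group_b = ["grateful for a productive day", "proud and calm today", "i feel empty"]

share_a = negative_share(group_a, model)
share_b = negative_share(group_b, model)
print(f"negative share, group A: {share_a:.2f}; group B: {share_b:.2f}")
```

On this toy data, group A receives the larger negative share. The comparison of per-group negative shares is the part that mirrors the abstract; the classifier itself is only a stand-in and says nothing about the accuracy reported in the paper.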
License
Copyright (c) 2024 Tvisha Choubey
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.