Predicting Student Depression Using the Naive Bayes Model on the Student Depression Dataset from Kaggle
Abstract
Background of Study: The increasing prevalence of depression among college students highlights the urgent need for effective early detection strategies to promote mental well-being within higher education environments.
Aims and Scope of Paper: This study aims to develop a predictive model for student depression using the Naive Bayes classification algorithm, with a focus on identifying key contributing factors from student-related data.
Methods: The study uses the Student Depression dataset from Kaggle, which contains structured survey data on academic stress, sleep duration, financial stress, GPA, and family mental health history. Preprocessing comprised feature selection, handling of missing values, and normalization. The dataset was split into training and testing sets at a 75:25 ratio. The model was trained in the R programming language with Laplace smoothing applied.
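The two steps named in the Methods, a 75:25 split and Laplace smoothing, can be illustrated with a minimal sketch. The study was carried out in R and its code is not shown here, so this is a standard-library Python illustration and not the authors' implementation. The feature names, the toy rows, and the function names (`split_rows`, `smoothed_likelihood`, `fit`, `predict`) are all hypothetical, and the features are assumed to be categorical.

```python
import math
import random
from collections import Counter, defaultdict


def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def smoothed_likelihood(count, class_total, n_levels, alpha=1.0):
    """Laplace-smoothed estimate of P(value | class)."""
    return (count + alpha) / (class_total + alpha * n_levels)


def fit(rows):
    """Collect the counts Naive Bayes needs; each row is (features, label)."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> value counts
    levels = defaultdict(set)            # feature -> distinct values seen
    for features, label in rows:
        for name, value in features.items():
            value_counts[(name, label)][value] += 1
            levels[name].add(value)
    return class_counts, value_counts, levels


def predict(model, features, alpha=1.0):
    """Return the label with the highest smoothed log-posterior."""
    class_counts, value_counts, levels = model
    total = sum(class_counts.values())
    scores = {}
    for label, class_total in class_counts.items():
        score = math.log(class_total / total)  # log prior
        for name, value in features.items():
            count = value_counts[(name, label)][value]
            score += math.log(
                smoothed_likelihood(count, class_total, len(levels[name]), alpha)
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy rows (1 = depressed, 0 = not depressed);
# these are NOT records from the Kaggle dataset.
ROWS = [
    ({"financial_stress": "high", "academic_pressure": "high"}, 1),
    ({"financial_stress": "high", "academic_pressure": "low"}, 1),
    ({"financial_stress": "high", "academic_pressure": "high"}, 1),
    ({"financial_stress": "low", "academic_pressure": "high"}, 1),
    ({"financial_stress": "low", "academic_pressure": "low"}, 0),
    ({"financial_stress": "low", "academic_pressure": "low"}, 0),
    ({"financial_stress": "low", "academic_pressure": "high"}, 0),
    ({"financial_stress": "high", "academic_pressure": "low"}, 0),
]

train, test = split_rows(ROWS)   # 75:25 split
model = fit(train)
correct = sum(predict(model, features) == label for features, label in test)
print(f"toy test accuracy: {correct}/{len(test)}")
```

With `alpha=1.0`, a feature value never observed for a class still receives a small non-zero probability, which keeps a single unseen value from driving the whole posterior to zero.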
Results: The Naive Bayes model achieved an accuracy of 77.66%, a specificity of 84.21%, and a sensitivity of 68.42%, indicating strong predictive performance, particularly in identifying depressive cases. Financial and academic stress were identified as the most influential factors.
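For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. It also shows that sensitivity and specificity trade places when the other class is treated as "positive". The abstract does not state which class was taken as positive, so that convention matters when reading the figures. The counts are illustrative values chosen for easy arithmetic, not the study's confusion matrix, and the function names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of positives correctly flagged
        "specificity": tn / (tn + fp),   # share of negatives correctly cleared
    }


def swap_positive_class(tp, fn, tn, fp):
    """Re-express the same counts with the other class treated as positive."""
    return tn, fp, tp, fn


# Illustrative counts only (assumed, not taken from the study).
counts = (30, 10, 50, 10)  # tp, fn, tn, fp

original = classification_metrics(*counts)
swapped = classification_metrics(*swap_positive_class(*counts))

print(original)
print(swapped)
```

Accuracy is unchanged by the swap, while the two class-specific rates exchange values.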
Conclusion: Despite its simplicity, the Naive Bayes algorithm proves to be an effective tool for initial screening of students at risk of depression, offering valuable support for educational institutions in delivering timely mental health interventions.
Copyright (c) 2025 Rebina Putri Sonjaya, Andre Rangga Gintara, Lala Septem Riza, Muhammad Nursalman, Eki Nugraha, Didin Wahyudin

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.