Machine learning-based predictive modeling of depressive symptoms in Chinese adolescents

J Affect Disord. 2025 May 12:119399. doi: 10.1016/j.jad.2025.119399. Online ahead of print.

ABSTRACT

BACKGROUND: This study aimed to develop models that use lifestyle indicators and socioeconomic status to predict the risk of depressive symptoms in adolescents, and to rank and explain these predictors.

METHOD: A cross-sectional study was conducted among 32,389 school students in grades 4-12. A self-rating depression scale (CES-D) was used to define depressive symptoms (score ≥ 16), and a lifestyle survey was used to investigate risk factors for depressive symptoms. The Boruta-RF algorithm was used for feature selection and to rank variable importance. A random forest model was constructed to predict the risk of depressive symptoms, and partial dependence plots (PDPs) were used to explain the relationship between each variable and the predicted outcome.
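As a rough illustration of the partial dependence idea described above, the following minimal sketch averages a model's predicted risk over all students while one feature is forced to each value on a grid. It is not the authors' code: `toy_predict`, the feature names `sleep_hours` and `skips_breakfast`, and the toy rows are hypothetical stand-ins for a fitted random forest and the survey data. Only the CES-D cutoff of 16 comes from the abstract.

```python
from statistics import mean

CESD_CUTOFF = 16  # threshold stated in the abstract


def has_depressive_symptoms(cesd_score):
    """Binary outcome label: CES-D score at or above the cutoff."""
    return cesd_score >= CESD_CUTOFF


def partial_dependence(predict, rows, feature, grid):
    """Average predicted risk when `feature` is forced to each grid value.

    predict: callable mapping one row (a dict) to a predicted probability
    rows:    list of dicts, one per student
    """
    curve = []
    for value in grid:
        # Overwrite the feature in every row; other features stay as observed
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, mean(preds)))
    return curve


# Hypothetical stand-in for a fitted model (assumption, not from the study):
# risk is lowest near 9 hours of sleep and higher when breakfast is skipped.
def toy_predict(row):
    risk = 0.1 + 0.02 * (row["sleep_hours"] - 9) ** 2 + 0.05 * row["skips_breakfast"]
    return min(max(risk, 0.0), 1.0)


# Hypothetical toy data, invented for illustration only
toy_rows = [
    {"sleep_hours": 7, "skips_breakfast": 1},
    {"sleep_hours": 8, "skips_breakfast": 0},
    {"sleep_hours": 9, "skips_breakfast": 0},
    {"sleep_hours": 6, "skips_breakfast": 1},
]

pdp_curve = partial_dependence(toy_predict, toy_rows, "sleep_hours", [6, 7, 8, 9, 10])
```

Plotting `pdp_curve` would give a curve that falls and then rises again, which is the kind of nonlinear shape a PDP can reveal and a single regression coefficient cannot.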

RESULTS: The Boruta-RF algorithm identified self-rated health, sleep duration, parental support for physical exercise, breakfast intake, screen time, skipping physical education classes, egg intake, grade, milk/soy product intake, and parental exercise habits as the ten most important factors for depressive symptoms. The AUC of the random forest model was 0.829 (95% CI: 0.820 – 0.837), suggesting good discrimination between students with and without depressive symptoms. Additionally, PDPs showed a nonlinear relationship between each predictor and the predicted risk of depressive symptoms.
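The abstract does not say how the confidence interval for the AUC was obtained. The sketch below shows one common option, a percentile bootstrap, purely as an assumption. The function names, the toy labels and the toy scores are hypothetical, and the pairwise AUC loop is only practical for small samples.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = depressive symptoms, scores = predicted risk
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]

point_estimate = auc(toy_labels, toy_scores)
ci_lower, ci_upper = bootstrap_auc_ci(toy_labels, toy_scores)
```

With tens of thousands of students, as in this study, a rank-based AUC computation would be preferable to the pairwise loop, and the resulting interval would be far narrower than on this toy sample.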

CONCLUSIONS: The prediction model, which uses lifestyle indicators routinely collected in schools, can effectively screen for high-risk individuals who need further mental health evaluation and can facilitate early detection of depressive symptoms in adolescents. The study is limited by its cross-sectional design, which precludes causal inference; its use of the CES-D rather than clinical diagnosis to define depressive symptoms; and its omission of neuroimaging biomarkers that might improve accuracy.

PMID:40368147 | DOI:10.1016/j.jad.2025.119399