Early childhood caries risk prediction using machine learning approaches in Bangladesh

BMC Oral Health. 2025 Jan 8;25(1):49. doi: 10.1186/s12903-025-05419-2.

ABSTRACT

BACKGROUND: In recent years, artificial intelligence (AI) has contributed to improvements in healthcare, including dentistry. The objective of this study was to develop a machine learning (ML) model for predicting early childhood caries (ECC) by identifying crucial health behaviours within mother-child pairs.

METHODS: We analysed a representative sample of 724 mothers with children under six years in Bangladesh, using both clinical and survey data. ECC was assessed in the clinical examinations using ICDAS II criteria. Recursive Feature Elimination (RFE) and Random Forest (RF) were applied to identify the optimal subset of features. Random forest classifier (RFC), extreme gradient boosting (XGBoost), support vector machine (SVM), adaptive boosting (AdaBoost), and multi-layer perceptron (MLP) models were compared to identify the best-fitting model for predicting ECC. SHAP and MDG-MDA plots were used to support model interpretability and to identify significant predictors.
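The feature selection step described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the study's clinical and survey variables. The sample size and the target of 10 features mirror the abstract, but the 25 candidate features, the 80/20 split, and all hyperparameters are illustrative assumptions, not the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the mother-child data set (assumption: 25 candidate features).
X, y = make_classification(
    n_samples=724, n_features=25, n_informative=8, random_state=0
)

# Hold out a test set (assumption: 80/20 stratified split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# RFE repeatedly fits the random forest and drops the least important
# feature until 10 features remain.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,
    step=1,
)
selector.fit(X_train, y_train)
selected = selector.get_support(indices=True)  # column indices of kept features

# Refit a random forest classifier on the selected subset only.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train[:, selected], y_train)

# Evaluate discrimination on the held-out test set.
test_scores = model.predict_proba(X_test[:, selected])[:, 1]
auc = roc_auc_score(y_test, test_scores)
print("selected feature indices:", selected.tolist())
print("test AUC-ROC:", round(auc, 3))
```

In this sketch the selector is fitted on the training split only, so the held-out test set does not influence which features are kept.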

RESULTS: Using the RFE feature selection method, the RFC model identified 10 features as the most relevant for ECC prediction: plaque score, age of child, mother’s education, number of siblings, age of mother, consumption of sweets, tooth cleaning tools, child’s tooth brushing frequency, helping the child with brushing, and use of fluoride toothpaste. In the test set, the final ML model achieved an AUC-ROC of 0.77, accuracy of 0.72, sensitivity of 0.80, and F1 score of 0.73. Within the prediction model, dental plaque was the strongest predictor of ECC (MDG: 0.08, MDA: 0.10).
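For readers less familiar with the reported metrics, the following sketch shows how accuracy, sensitivity, and F1 score are derived from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's test-set results, and the function name `classification_metrics` is an assumption of this example.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute common binary classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total       # share of all cases classified correctly
    sensitivity = tp / (tp + fn)       # share of true ECC cases that were detected
    precision = tp / (tp + fp)         # share of predicted ECC cases that were true
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (not taken from the study).
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity is often emphasised in a screening context because a missed ECC case (false negative) is usually more costly than an unnecessary follow-up examination.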

CONCLUSIONS: Our final ML model, integrating 10 key features, has the potential to predict ECC effectively in children under five years. Additional research is needed for validation and optimization across various groups.

PMID:39780148 | DOI:10.1186/s12903-025-05419-2