Sci Rep. 2025 Nov 18;15(1):40559. doi: 10.1038/s41598-025-24245-8.
ABSTRACT
Pregnant women in rural Ethiopia face substantial barriers to accessing adequate healthcare services, contributing to adverse maternal and neonatal health outcomes. Traditional statistical approaches often fall short in capturing the complex, nonlinear interactions among the diverse factors influencing healthcare access. In contrast, machine learning (ML) techniques offer robust tools for analysing large-scale datasets, identifying hidden patterns, and generating accurate predictive insights to inform healthcare interventions. This study aimed to determine the most effective ML algorithm for predicting healthcare service access among pregnant women in rural Ethiopia. Data were sourced from the Ethiopian Demographic and Health Survey (EDHS). Seven supervised ML classifiers, namely Gradient Boosting, Random Forest, K-Nearest Neighbors (KNN), Decision Tree, Support Vector Machine (SVM), Logistic Regression, and Naive Bayes, were applied to predict healthcare access and identify its determinants. Model performance was evaluated using accuracy and the area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) analysis was conducted to interpret the contribution of individual features. Gradient Boosting outperformed all other models, achieving the highest AUC (81.40%) with an accuracy of 79.55%. Key protective (negative) factors associated with improved healthcare access included higher household wealth, residence in the Amhara region, media exposure, and alcohol avoidance. Conversely, lack of formal education emerged as a significant barrier to accessing maternal health services. The superior performance of the Gradient Boosting model highlights its effectiveness in predicting healthcare access among pregnant women in rural Ethiopia.
Socioeconomic status, regional residence, media exposure, and behavioural factors were linked to health service access, while lack of education remained a prominent barrier. These findings support the utility of machine learning in guiding data-driven policy and targeted interventions to enhance maternal health outcomes in resource-limited settings.
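The two evaluation metrics named in the abstract, accuracy and AUC, can be illustrated with a minimal Python sketch. The labels and scores below are invented toy values, not EDHS data, and the helper functions accuracy and roc_auc are hypothetical names written for this illustration; the study's own implementation is not described in the abstract. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = accessed services, 0 = did not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities from some classifier.
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2, 0.8, 0.1]
# Accuracy needs hard labels, so threshold the scores at 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold (0.5 here), whereas AUC summarises ranking quality across all thresholds, which is one reason a model comparison may report both.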
PMID:41253920 | DOI:10.1038/s41598-025-24245-8