Environ Health Perspect. 2024 Jun;132(6):67005. doi: 10.1289/EHP13838. Epub 2024 Jun 17.
ABSTRACT
BACKGROUND: Maternal cigarette smoking during pregnancy (MSDP) is associated with numerous adverse health outcomes in infants and children, with potential lifelong consequences. The negative effects of MSDP on placental DNA methylation (DNAm), structure, and function are well established.
OBJECTIVE: Our aim was to develop biomarkers of MSDP using DNAm measured in placentas (
METHODS: We compared the ability of four machine learning methods [logistic least absolute shrinkage and selection operator (LASSO) regression, logistic elastic net regression, random forest, and gradient boosting machine] to classify MSDP based on placental DNAm signatures. We developed separate models using the complete EPIC array dataset and using the subset of probes also found on the 450K array, so that models exist for both platforms. For comparison, we developed a model using CpGs previously associated with MSDP in placenta. For each final model, we used the model coefficients and normalized beta values to calculate a placental smoking index (PSI) score for each sample. Final models were validated in two external datasets: the Extremely Low Gestational Age Newborn observational study,
RESULTS: Logistic LASSO regression demonstrated the highest performance in cross-validation testing with the lowest number of input CpGs. Accuracy was greatest in external datasets when using models developed for the same platform. PSI scores in smokers only (
DISCUSSION: To our knowledge, we have developed the first placental DNAm-based biomarkers of MSDP with broad utility to studies of prenatal disease origins. https://doi.org/10.1289/EHP13838.
PMID:38885141 | DOI:10.1289/EHP13838
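
ILLUSTRATIVE SKETCH: The METHODS describe calculating PSI scores from model coefficients and normalized beta values. One plausible reading is a coefficient-weighted sum of each sample's beta values at the selected CpGs. The Python sketch below assumes that form. The function name, the optional intercept, and all numeric values are hypothetical and are not taken from the published models.

```python
def placental_smoking_index(betas, coefs, intercept=0.0):
    """Return one score per sample as intercept + sum(coef_j * beta_ij).

    betas: list of samples, each a list of normalized beta values (0 to 1),
           ordered to match `coefs`.
    coefs: per-CpG model coefficients (hypothetical values in this sketch).
    intercept: assumed optional offset; whether the published score
               includes one is not stated in the abstract.
    """
    scores = []
    for row in betas:
        if len(row) != len(coefs):
            raise ValueError("each sample needs one beta value per coefficient")
        scores.append(intercept + sum(c * b for c, b in zip(coefs, row)))
    return scores


# Hypothetical coefficients for three CpGs and beta values for two samples.
example_coefs = [1.5, -2.0, 0.5]
example_betas = [
    [0.2, 0.4, 0.6],
    [0.8, 0.1, 0.3],
]
example_scores = placental_smoking_index(example_betas, example_coefs)
```

Under this assumed form, a CpG with a positive coefficient raises the score as its methylation increases, and a CpG with a negative coefficient lowers it. If the score is used for classification, a threshold would have to be chosen separately.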