Using Machine Learning for the Automated Segmentation and Detection of Swallows Obtained by Digital Cervical Auscultation in Preterm Neonates

Dysphagia. 2025 Sep 12. doi: 10.1007/s00455-025-10879-3. Online ahead of print.

ABSTRACT

The clinical application of acoustic swallowing sound parameters collected by digital cervical auscultation (CA) is limited by the time-consuming manual segmentation that trained experts must perform. Automated identification of swallowing sounds from audio wave files using machine learning has achieved accuracies of 76-95% in children and adults, but no data exist for preterm neonates. This study aimed to determine whether automated machine learning using a transfer learning approach could accurately identify and segment swallows from swallowing sounds collected in preterm neonates. Thin fluid swallow sounds were collected from 78 preterm neonates (median gestational age at birth 34 weeks, range 25-36 weeks; 52.6% male) across 3 Australian special care nurseries. The base machine learning model was a deep convolutional neural network (DCNN) pre-trained for audio event classification. With raw swallow audio data as input, embedding vectors were generated from the base DCNN and used to train a feedforward neural network to determine whether a swallow was present within an audio segment. The model showed high overall accuracy (94%) in identifying preterm swallows. It performed better on bottle-feeding swallows (sensitivity 95%, specificity 96%) than on breastfeeding swallows (sensitivity 95%, specificity 92%), with the difference confined to specificity. This study demonstrates the successful use of transfer learning to accurately identify and segment digital swallowing sounds in preterm neonates. Application of this model could support the development of a digital CA app that automatically classifies swallow sounds and improves the objectivity of CA in clinical practice within special care nurseries.
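
The second stage of the pipeline described in the abstract (a feedforward network trained on embedding vectors, then scored by sensitivity and specificity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the pre-trained DCNN, the embedding size, or the network layout, so the embeddings here are replaced by synthetic random vectors, and the function names (`train_feedforward`, `predict`, `swallow_metrics`), the 8-dimensional embedding, the hidden size, and the training settings are all assumptions made for illustration.

```python
import numpy as np

def train_feedforward(X, y, hidden=16, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer binary classifier by full-batch gradient descent.
    Hypothetical stand-in for the feedforward network mentioned in the abstract."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.maximum(0.0, X @ W1 + b1)            # ReLU hidden layer
        logits = np.clip(H @ W2 + b2, -30.0, 30.0)  # clipped for numerical stability
        p = 1.0 / (1.0 + np.exp(-logits))           # predicted swallow probability
        g = (p - y) / n                             # gradient of mean cross-entropy w.r.t. logits
        gH = np.outer(g, W2) * (H > 0)              # backpropagate through ReLU
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    """Return 1 (swallow) or 0 (no swallow) for each embedding row."""
    W1, b1, W2, b2 = params
    H = np.maximum(0.0, X @ W1 + b1)
    return (H @ W2 + b2 > 0).astype(int)

def swallow_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary swallow labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

# Synthetic stand-in for DCNN embeddings: NOT real swallow data.
rng = np.random.default_rng(42)
n_per_class, dim = 200, 8                           # assumed sizes, for illustration only
X = np.vstack([
    rng.normal(1.0, 1.0, size=(n_per_class, dim)),  # "swallow" segments
    rng.normal(-1.0, 1.0, size=(n_per_class, dim)), # "non-swallow" segments
])
y = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])

idx = rng.permutation(len(y))
train_idx, test_idx = idx[:300], idx[300:]          # simple hold-out split
params = train_feedforward(X[train_idx], y[train_idx])
metrics = swallow_metrics(y[test_idx], predict(params, X[test_idx]))
print(metrics)
```

In a real setting the rows of `X` would be embedding vectors produced by the frozen pre-trained audio model for each audio segment, and the metrics would be reported separately for bottle-feeding and breastfeeding recordings, as in the abstract.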

PMID:40936063 | DOI:10.1007/s00455-025-10879-3

John Joseph