Cureus. 2025 Apr 15;17(4):e82300. doi: 10.7759/cureus.82300. eCollection 2025 Apr.
ABSTRACT
Background: Periventricular-intraventricular hemorrhage (PV-IVH) is a common complication in very preterm infants (VPIs) and remains a significant cause of neonatal morbidity and long-term neurological impairment. Cranial ultrasound (CUS) is the standard bedside tool for early detection. This study aimed to explore the potential of ChatGPT-4o (OpenAI, San Francisco, USA), an artificial intelligence model, in interpreting CUS images to assist in the diagnosis of PV-IVH.

Method: A cross-sectional study was conducted on 35 VPIs in a neonatal intensive care unit in Vietnam. The final CUS images, including coronal and sagittal views, were obtained within the first two weeks. Standardized coronal views through the anterior fontanelle were routinely acquired for optimal visualization, with sagittal views added as needed. The images were analyzed using the ChatGPT-4o model with a standardized diagnostic prompt, and the results were compared with interpretations by pediatric radiologists.

Results: From September 2024 to March 2025, 35 VPIs were screened for PV-IVH, of whom 16 (45.7%) were diagnosed with PV-IVH and 19 (54.3%) were not. Infants with PV-IVH required more intensive resuscitation: eight (50%) received positive pressure ventilation, and seven (43.8%) required intubation. The median postnatal age at PV-IVH detection was 10 days (interquartile range: 3.5 to 13.8 days). ChatGPT-4o correctly identified 12 of the 16 PV-IVH cases (75%) and missed four (25%, false negatives), while correctly classifying 16 of the 19 non-PV-IVH cases (84.2%). The model achieved an area under the curve (AUC) of 0.796, with a positive likelihood ratio of 4.75 and moderate inter-rater agreement with pediatric radiologists (κ = 0.595, p < 0.001).

Conclusions: The findings highlight the potential of the accessible ChatGPT-4o model in aiding early screening for PV-IVH in resource-limited settings. The model showed moderate diagnostic performance and fair-to-good agreement with specialists. However, further large-scale studies are needed.
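
The reported performance figures can be cross-checked from the counts given in the Results. The sketch below is illustrative only and is not the authors' analysis code. It assumes the radiologists' reading is the reference standard, derives the three false positives as 19 minus 16, and assumes the AUC corresponds to a single binary operating point, in which case it equals the mean of sensitivity and specificity. The function name `diagnostic_metrics` is hypothetical.

```python
# Counts taken from the abstract; the radiologist reading is assumed to be the reference.
TP, FN = 12, 4   # PV-IVH cases: identified / missed by the model
TN, FP = 16, 3   # non-PV-IVH cases: correctly classified / flagged (FP derived as 19 - 16)


def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard 2x2 diagnostic accuracy measures (hypothetical helper)."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Positive likelihood ratio: true-positive rate over false-positive rate.
    lr_positive = sensitivity / (1 - specificity)
    # Assumption: for one binary decision point, AUC is the mean of sensitivity and specificity.
    auc_single_point = (sensitivity + specificity) / 2
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "auc_single_point": auc_single_point,
        "kappa": kappa,
    }


metrics = diagnostic_metrics(TP, FN, TN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the computed values round to the figures stated in the abstract (sensitivity 0.750, specificity 0.842, positive likelihood ratio 4.75, AUC 0.796, κ 0.595), which suggests the reported numbers are internally consistent.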
PMID:40376373 | PMC:PMC12080620 | DOI:10.7759/cureus.82300