Evaluating AI performance in pediatric surgery: temporal bias and multimodal limitations in large language model assessment