Folia Med (Plovdiv). 2025 Aug 14;67(4). doi: 10.3897/folmed.67.e154338.
ABSTRACT
This study aimed to evaluate the performance of three large language models (LLMs), namely ChatGPT-4.0, Claude 3.5 Sonnet, and DeepSeek R1, in answering multiple-choice questions (MCQs) related to pediatric dentistry. Accuracy and justification quality were analyzed using Bloom’s taxonomy.
PMID:40884138 | DOI:10.3897/folmed.67.e154338