Knowledge-level comparison in pulpal and periapical diseases: dental students versus artificial intelligence models (Gemini, Microsoft Copilot, ChatGPT-3.5, ChatGPT-4o): cross-sectional study

Şimşek, Emine

Authors	Şimşek, Emine
External Authors	Kurt, Özge
Uniform Resource Identifier (URI)	https://hdl.handle.net/20.500.14114/8096
Publication Type	Article
Publication Year	2025
Publisher	Springer Nature
Journal Name	BMC Medical Education
Subject Headings	Artificial intelligence; Dental students; Education, Dental; Large language models; Periapical periodontitis; Pulp disease
Indexed In	Web of Science

Background This study explored the diagnostic accuracy of artificial intelligence (AI) chatbots and dental students
when responding to questions related to pulpal and periapical diseases. Rapid advancements in AI have led to
increased interest in its applicability to clinical education and decision-making in dentistry.
Objective To compare the accuracy rates of responses given by dental students and various AI-based chatbots
(ChatGPT-3.5, ChatGPT-4o, Gemini, and Microsoft Copilot) to multiple-choice questions designed to assess knowledge
related to pulpal and periapical diseases.
Methods The study included third- and fifth-year dental students representing different levels of clinical training,
along with four distinct AI-based chatbots. A total of 327 responses were collected from students, while each chatbot
generated 450 responses. The evaluation was based on 15 multiple-choice questions developed in accordance
with the 2020 version of the American Association of Endodontists (AAE) clinical guidelines. The accuracy rates of
the groups were compared using descriptive statistics, one-way ANOVA, Bonferroni post hoc tests for significant
differences, and Chi-square tests for correct versus incorrect response ratios.
Results The highest accuracy rate was observed among fifth-year dental students (85.1%), followed by ChatGPT-4o
(79.6%), ChatGPT-3.5 (75.1%), Gemini (71.6%), third-year students (64.9%), and Microsoft Copilot (61.3%). A statistically
significant difference was found among the groups (p < 0.05). ChatGPT-4o demonstrated a comparable accuracy rate
to fifth-year students with more clinical experience (p > 0.05), whereas other chatbots and third-year students showed
lower performance.
Conclusion Chatbots exhibited varying levels of accuracy in diagnosing pulpal and periapical diseases. ChatGPT-4o
performed at a level similar to that of more clinically experienced students, suggesting its potential as a supportive
tool in dental education and clinical decision support systems. However, the relatively lower accuracy rates of models
such as Gemini and Microsoft Copilot underscore the continued importance of human expertise. These findings
suggest that while AI systems may serve as complementary tools in education, they cannot fully replace clinical
judgment grounded in human experience.
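
As a rough illustration of the Chi-square comparison of correct versus incorrect responses described in the Methods, the sketch below back-calculates correct-answer counts for the four chatbots from the reported accuracy rates and the 450 responses each generated. The back-calculated counts, the restriction to chatbots only (per-group student counts are not given in the abstract), and all function and variable names are assumptions made for illustration. This is not the authors' analysis, and the statistic it yields is not a reported result.

```python
# Illustrative sketch only: counts are back-calculated from the rounded
# percentages in the abstract, not taken from the study data.

RESPONSES_PER_CHATBOT = 450  # stated in the Methods

# Accuracy rates reported in the Results (percent).
reported_accuracy = {
    "ChatGPT-4o": 79.6,
    "ChatGPT-3.5": 75.1,
    "Gemini": 71.6,
    "Microsoft Copilot": 61.3,
}


def correct_counts(accuracy, n):
    """Back-calculate the number of correct responses per group (assumption)."""
    return {name: round(pct / 100 * n) for name, pct in accuracy.items()}


def chi_square_statistic(correct, n):
    """Pearson chi-square for a groups x (correct, incorrect) table, equal group size n."""
    total = n * len(correct)
    total_correct = sum(correct.values())
    expected_correct = n * total_correct / total
    expected_incorrect = n - expected_correct
    stat = 0.0
    for observed in correct.values():
        stat += (observed - expected_correct) ** 2 / expected_correct
        stat += ((n - observed) - expected_incorrect) ** 2 / expected_incorrect
    return stat


counts = correct_counts(reported_accuracy, RESPONSES_PER_CHATBOT)
statistic = chi_square_statistic(counts, RESPONSES_PER_CHATBOT)

# Critical value of the chi-square distribution for 3 degrees of freedom
# (4 groups - 1) at alpha = 0.05.
CRITICAL_VALUE_DF3 = 7.815

print(counts)
print(f"chi-square = {statistic:.2f}, exceeds critical value: {statistic > CRITICAL_VALUE_DF3}")
```

Under these assumptions the statistic exceeds the 0.05 critical value, which is in line with the abstract's statement of a significant difference among groups. It does not reproduce the study's own test, which also included the two student groups.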

Collections

Faculties
Faculty of Dentistry
Division of Clinical Sciences
Department of Endodontics

Title dc.title	Knowledge-level comparison in pulpal and periapical diseases: dental students versus artificial intelligence models (Gemini, Microsoft Copilot, ChatGPT-3.5, ChatGPT-4o): cross-sectional study
Authors dc.contributor.author	Şimşek, Emine
External Authors dc.contributor.other	Kurt, Özge
Publisher dc.publisher	Springer Nature
Publication Type dc.type	Article
Record Entry Date dc.date.accessioned	2025-12-26
Publication Year dc.date.issued	2025
Open Access Date dc.date.available	2025-12-26
Language dc.language.iso	eng
Subject Headings dc.subject	Artificial intelligence
Subject Headings dc.subject	Dental students
Subject Headings dc.subject	Education, Dental
Subject Headings dc.subject	Large language models
Subject Headings dc.subject	Periapical periodontitis
Subject Headings dc.subject	Pulp disease
ISSN dc.identifier.issn	1472-6920
First Page dc.identifier.startpage	-
Last Page dc.identifier.endpage	-
Article Number dc.identifier.articlenumber	-
Journal Name dc.relation.journal	BMC Medical Education
Journal Issue dc.identifier.issue	1657
Journal Volume dc.identifier.volume	25
Uniform Resource Identifier (URI) dc.identifier.uri	https://hdl.handle.net/20.500.14114/8096
Indexed In dc.source.database	Web of Science

1 file available

Indefinite embargo