e-ISSN 2459-1726
Artificial intelligence driven dental trauma assessment: Comparing the performance of chatbot models [Turk Endod J]
Turk Endod J. 2025; 10(2): 109-115 | DOI: 10.14744/TEJ.2025.29200

İdil Özden1, Melike Beyza Kaplanoğlu1, Merve Gökyar1, Mustafa Enes Özden2, Hesna Sazak Öveçoğlu1
1Department of Endodontics, Marmara University Faculty of Dentistry, Istanbul, Türkiye
2Republic of Türkiye Ministry of Health Kahramankazan District Health Administration, Ankara, Türkiye

Purpose: This study aimed to compare the accuracy and reliability of four chatbot applications (ChatGPT o1, Google Gemini Advanced, DeepSeek R1, and Perplexity AI) in the context of dental traumatology.
Methods: Twenty-five dichotomous questions, derived from the 2020 guidelines of the International Association of Dental Traumatology (IADT), were administered by three independent researchers to each chatbot over a 10-day period. Each researcher asked every question three times per day, generating 90 responses per question for each chatbot. Responses were categorised as “correct,” “incorrect,” or “refer to a practitioner.” Accuracy rates and Fleiss’ Kappa values were calculated to assess performance and inter-response reliability.
Results: All chatbot models demonstrated high levels of accuracy. ChatGPT o1 yielded the highest accuracy rate (86.4%), followed by DeepSeek (84.0%), Perplexity (80.5%), and Google Gemini Advanced (80.2%). The highest Fleiss’ Kappa value was observed in the DeepSeek model (0.709), indicating the greatest internal consistency, while the Google Gemini Advanced model recorded the lowest value (0.185). Although DeepSeek and Perplexity exhibited relatively stronger reliability metrics, none of the models achieved complete consistency, with intra-platform variation occasionally present.
Conclusion: Contemporary chatbot models show substantial accuracy and improving reliability in responding to dental traumatology queries, suggesting their potential as clinical support tools. Nonetheless, further refinement and domain-specific optimisation remain necessary.
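
The two metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the short category labels, and the choice to treat each repeated response to a question as one "rating" of that question are assumptions made only for illustration.

```python
from collections import Counter

# Assumed short labels for the three response categories in the abstract.
CATEGORIES = ("correct", "incorrect", "refer")


def accuracy_rate(responses):
    """Return the share of responses labelled 'correct'."""
    return sum(r == "correct" for r in responses) / len(responses)


def fleiss_kappa(ratings_per_question):
    """Fleiss' kappa for a list of per-question label lists.

    Every inner list must hold the same number of labels (at least two),
    e.g. the 90 repeated responses to one question from one chatbot.
    """
    n_items = len(ratings_per_question)
    n_ratings = len(ratings_per_question[0])
    if n_ratings < 2 or any(len(r) != n_ratings for r in ratings_per_question):
        raise ValueError("each question needs the same number (>= 2) of ratings")

    counts = [Counter(r) for r in ratings_per_question]

    # Observed agreement: mean over questions of the per-question agreement P_i.
    p_items = [
        (sum(c[cat] ** 2 for cat in CATEGORIES) - n_ratings)
        / (n_ratings * (n_ratings - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_ratings) for cat in CATEGORIES
    ]
    p_exp = sum(p ** 2 for p in p_cats)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")

    return (p_bar - p_exp) / (1 - p_exp)
```

Called on a hypothetical list of 25 inner lists with 90 labels each, `fleiss_kappa` would return one reliability value per chatbot, and `accuracy_rate` applied to the flattened labels would give the corresponding accuracy rate.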

Keywords: Accuracy, artificial intelligence, chatbot, dental traumatology, reliability

Corresponding Author: İdil Özden, Türkiye
Manuscript Language: English