Publication:
Assessing the accuracy and reproducibility of ChatGPT for responding to patient inquiries about otosclerosis

dc.contributor.author: Mete, Utku
dc.contributor.author: Özmen, Ömer Afşin
dc.contributor.buuauthor: METE, UTKU
dc.contributor.buuauthor: ÖZMEN, ÖMER AFŞIN
dc.contributor.department: Tıp Fakültesi
dc.contributor.department: Kulak Burun Boğaz Hastalıkları Ana Bilim Dalı
dc.contributor.orcid: 0000-0002-9698-0546
dc.contributor.scopusid: 59149489600
dc.contributor.scopusid: 55407733900
dc.date.accessioned: 2025-05-12T22:31:52Z
dc.date.issued: 2025-03-01
dc.description.abstract: Background: Patients increasingly use chatbots powered by artificial intelligence to seek information. However, there is a lack of reliable studies on the accuracy and reproducibility of the information provided by these models. Therefore, we conducted a study investigating ChatGPT's responses to questions about otosclerosis. Methods: Ninety-six otosclerosis-related questions were collected from internet searches and from the websites of professional institutions and societies, and divided into four subcategories. These questions were posed to the latest version of ChatGPT Plus, and the responses were assessed by two otorhinolaryngology surgeons. Accuracy was graded as correct, incomplete, mixed, or irrelevant. Reproducibility was evaluated by comparing the consistency of the two answers to each specific question. Results: The overall accuracy and reproducibility rates of GPT-4o for correct answers were 64.60% and 89.60%, respectively. Accuracy and reproducibility rates for correct answers were 64.70% and 91.20% for basic knowledge; 64.0% and 92.0% for diagnosis & management; 52.95% and 82.35% for medical & surgical treatment; and 75.0% and 90.0% for operative risks & postoperative period, respectively. No significant differences were found between the answers and groups in terms of accuracy or reproducibility (p = 0.073 and p = 0.752, respectively). Conclusion: GPT-4o achieved satisfactory accuracy results, except in the diagnosis & management and medical & surgical treatment categories. Reproducibility was generally high across all categories. Given the audio and visual communication capabilities of GPT-4o, this model can be used under the supervision of a medical professional to provide medical information and support for otosclerosis patients.
dc.identifier.doi: 10.1007/s00405-024-09039-4
dc.identifier.endpage: 1575
dc.identifier.issn: 0937-4477
dc.identifier.issue: 3
dc.identifier.scopus: 2-s2.0-85207541187
dc.identifier.startpage: 1567
dc.identifier.uri: https://hdl.handle.net/11452/51347
dc.identifier.volume: 282
dc.indexed.scopus: Scopus
dc.language.iso: en
dc.publisher: Springer
dc.relation.journal: European Archives of Oto-Rhino-Laryngology
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Patient information
dc.subject: Otosclerosis
dc.subject: Large language models
dc.subject: Hearing loss
dc.subject: ChatGPT
dc.subject: Artificial intelligence
dc.subject.scopus: Genetic and Imaging Insights into Otosclerosis
dc.title: Assessing the accuracy and reproducibility of ChatGPT for responding to patient inquiries about otosclerosis
dc.type: Article
dspace.entity.type: Publication
local.contributor.department: Tıp Fakültesi/Kulak Burun Boğaz Hastalıkları Ana Bilim Dalı
local.indexed.at: Scopus
relation.isAuthorOfPublication: fdadf4a0-7bbe-46b0-90b4-36275b6ddf52
relation.isAuthorOfPublication: f5faf2b2-e998-4b3c-a3a7-da845d815696
relation.isAuthorOfPublication.latestForDiscovery: fdadf4a0-7bbe-46b0-90b4-36275b6ddf52

Files