Publication:
Performance of several large language models when answering common patient questions about type 1 diabetes in children: Accuracy, comprehensibility and practicality

dc.contributor.buuauthor: EREN, ERDAL
dc.contributor.buuauthor: DENKBOY ÖNGEN, YASEMİN
dc.contributor.buuauthor: AYDIN, AYLA İREM
dc.contributor.buuauthor: ATAK, MERYEM
dc.contributor.department: Faculty of Medicine
dc.contributor.department: Department of Internal Medicine
dc.contributor.orcid: 0000-0002-8387-9959
dc.contributor.scopusid: 60103010000
dc.contributor.scopusid: 57195728586
dc.contributor.scopusid: 57215867086
dc.date.accessioned: 2025-11-28T08:04:26Z
dc.date.issued: 2025-12-01
dc.description.abstract: Background: The use of large language models (LLMs) in healthcare has expanded significantly with advances in natural language processing. Models such as ChatGPT and Google Gemini are increasingly used to generate human-like responses to questions, including those posed by patients and their families. With the rising incidence of type 1 diabetes (T1D) among children, families frequently seek reliable answers about the disease. Previous research has focused on type 2 diabetes, and studies on T1D in the pediatric population remain limited. This study aimed to evaluate and compare the performance and effectiveness of different LLMs in answering common questions about T1D. Methods: This cross-sectional, comparative study used questions frequently asked by children with T1D and their parents. Twenty questions were selected from inquiries made to pediatric endocrinologists via social media. The performance of ChatGPT-3.5, ChatGPT-4, ChatGPT-4o, Gemini, and Gemini Advanced was assessed using a standard prompt for each model. The responses were evaluated by five pediatric endocrinologists with an interest in diabetes using the General Quality Scale (GQS), a 5-point Likert scale, assessing factors such as accuracy, language simplicity, and empathy. Results: All five LLMs responded to the 20 selected questions, and their performance was evaluated by GQS scores. ChatGPT-4o had the highest mean score (3.78 ± 1.09), while Gemini had the lowest (3.40 ± 1.24). Despite these differences, no significant variation was observed between the models (p = 0.103). However, ChatGPT-4o, ChatGPT-4, and Gemini Advanced produced higher-quality answers than ChatGPT-3.5 and Gemini, scoring consistently between 3 and 4 points. ChatGPT-3.5 had the smallest variation in response quality, indicating consistency, but it did not reach the higher performance levels of the other models. Conclusions: This study demonstrated that all evaluated LLMs performed similarly in answering common questions about T1D. LLMs such as ChatGPT-4o and Gemini Advanced can provide above-average, accurate, and patient-friendly answers to common questions about T1D. Although no significant differences were observed, the latest versions of LLMs show promise for integration into healthcare, provided they continue to be evaluated and improved. Further research should focus on developing specialized LLMs tailored for pediatric diabetes care.
dc.identifier.doi: 10.1186/s12887-025-05945-6
dc.identifier.issue: 1
dc.identifier.scopus: 2-s2.0-105018274518
dc.identifier.uri: https://hdl.handle.net/11452/56888
dc.identifier.volume: 25
dc.indexed.scopus: Scopus
dc.language.iso: en
dc.publisher: BioMed Central Ltd
dc.relation.journal: BMC Pediatrics
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Type 1 diabetes
dc.subject: Patient information
dc.subject: Large Language models
dc.subject: Google gemini
dc.subject: ChatGPT
dc.subject: Artificial intelligence
dc.title: Performance of several large language models when answering common patient questions about type 1 diabetes in children: Accuracy, comprehensibility and practicality
dc.type: Article
dspace.entity.type: Publication
local.contributor.department: Faculty of Medicine/Department of Internal Medicine
local.indexed.at: Scopus
relation.isAuthorOfPublication: 2d1c6521-88a9-4270-9918-92f16f98006c
relation.isAuthorOfPublication: ac939042-fc3d-410c-85ac-ec38841d5cad
relation.isAuthorOfPublication: 7f174e7e-ddd0-45ad-8d41-d37bca5ee9af
relation.isAuthorOfPublication: 55a91d6f-c2a0-4168-8e68-e7f769195364
relation.isAuthorOfPublication.latestForDiscovery: 2d1c6521-88a9-4270-9918-92f16f98006c

Files

Original bundle

Name:
Ongen_vd_2025.pdf
Size:
1.03 MB
Format:
Adobe Portable Document Format