Publication:
The Intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2

dc.contributor.author: Koraishi, Osama
dc.contributor.buuauthor: Koraishi, Osama
dc.contributor.department: Bursa Uludağ Üniversitesi
dc.contributor.orcid: 0009-0008-1670-3436
dc.contributor.scopusid: 59317339800
dc.date.accessioned: 2025-05-12T22:31:26Z
dc.date.issued: 2024-01-01
dc.description.abstract: This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability measures such as Cohen's weighted kappa and intraclass correlation. The results revealed a high agreement in means and substantial reliability between the two grading methods on the level of the majority of texts. However, individual discrepancies and outliers were also identified, underscoring the nuanced nature of grading. While ChatGPT demonstrated efficiency and general alignment with human grading, the study concludes that it should not replace human judgment, particularly due to these observed inconsistencies. The findings contribute valuable insights into the potential and limitations of AI in educational grading and emphasize the importance of a comprehensive quantitative evaluation.
dc.identifier.doi: 10.32038/ltrq.2024.43.02
dc.identifier.endpage: 42
dc.identifier.issn: 26676753
dc.identifier.scopus: 2-s2.0-85203337893
dc.identifier.startpage: 22
dc.identifier.uri: https://hdl.handle.net/11452/51343
dc.identifier.volume: 43
dc.indexed.scopus: Scopus
dc.language.iso: en
dc.publisher: European Knowledge Development (EUROKD)
dc.relation.journal: Language Teaching Research Quarterly
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Natural Language Processing (NLP)
dc.subject: IELTS
dc.subject: ChatGPT
dc.subject: CALL
dc.subject: Artificial Intelligence in Education (AIEd)
dc.subject: Artificial Intelligence (AI)
dc.subject.scopus: Testing; Raters; English Language
dc.title: The Intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2
dc.type: Article
dspace.entity.type: Publication
local.contributor.department: Bursa Uludağ Üniversitesi
local.indexed.at: Scopus

Files