The Intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2

Koraishi, Osama

dc.contributor.author	Koraishi, Osama
dc.contributor.buuauthor	Koraishi, Osama
dc.contributor.department	Bursa Uludağ Üniversitesi
dc.contributor.orcid	0009-0008-1670-3436
dc.contributor.scopusid	59317339800
dc.date.accessioned	2025-05-12T22:31:26Z
dc.date.issued	2024-01-01
dc.description.abstract	This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis took a multifaceted approach, including a comparison of means and reliability measures such as Cohen's weighted kappa and intraclass correlation. The results revealed high agreement in means and substantial reliability between the two grading methods for the majority of texts. However, individual discrepancies and outliers were also identified, underscoring the nuanced nature of grading. While ChatGPT demonstrated efficiency and general alignment with human grading, the study concludes that it should not replace human judgment, particularly because of these observed inconsistencies. The findings contribute insights into the potential and limitations of AI in educational grading and emphasize the importance of a comprehensive quantitative evaluation.
dc.identifier.doi	10.32038/ltrq.2024.43.02
dc.identifier.endpage	42
dc.identifier.issn	26676753
dc.identifier.scopus	2-s2.0-85203337893
dc.identifier.startpage	22
dc.identifier.uri	https://hdl.handle.net/11452/51343
dc.identifier.volume	43
dc.indexed.scopus	Scopus
dc.language.iso	en
dc.publisher	European Knowledge Development (EUROKD)
dc.relation.journal	Language Teaching Research Quarterly
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	Natural Language Processing (NLP)
dc.subject	IELTS
dc.subject	ChatGPT
dc.subject	CALL
dc.subject	Artificial Intelligence in Education (AIEd)
dc.subject	Artificial Intelligence (AI)
dc.subject.scopus	Testing; Raters; English Language
dc.title	The Intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2
dc.type	Article
dspace.entity.type	Publication
local.contributor.department	Bursa Uludağ Üniversitesi
local.indexed.at	Scopus
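
Illustrative note on the reliability measure named in the abstract. The abstract cites Cohen's weighted kappa as one of the agreement statistics between ChatGPT and human raters, but this record does not say which weighting scheme was used or give any scores. The sketch below is therefore only an assumption-laden example: it uses quadratic weights by default, treats IELTS bands as the ordered half-band scale from 0 to 9, and runs on made-up scores. The function name `weighted_kappa` and the sample lists are hypothetical and are not taken from the article.

```python
def weighted_kappa(rater_a, rater_b, categories, weights="quadratic"):
    """Cohen's weighted kappa for two raters on an ordered category scale.

    Assumption: `weights` is "quadratic" or "linear"; the record does not
    state which scheme the study used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if weights == "quadratic" else d

    # Weighted observed disagreement vs. disagreement expected by chance.
    num = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    den = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if den == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - num / den


# IELTS band scale in half-band steps: 0.0, 0.5, ..., 9.0.
BANDS = [b / 2 for b in range(0, 19)]

# Hypothetical scores for five essays (NOT data from the study).
human_bands = [6.0, 6.5, 7.0, 5.5, 6.0]
model_bands = [6.0, 7.0, 7.0, 5.5, 6.5]

kappa = weighted_kappa(human_bands, model_bands, BANDS)
print(f"quadratic weighted kappa: {kappa:.3f}")
```

Quadratic weights penalize a two-band gap four times as heavily as a one-band gap, which is one common choice for ordinal rating scales; passing `weights="linear"` gives the proportional alternative.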

Collections

İndeksli Yayınlar / Indexed Publications
