The Intersection of AI and language assessment: A study on the reliability of ChatGPT in grading IELTS writing task 2

Koraishi, Osama

Date

2024-01-01

Institutional Authors

Koraishi, Osama

Authors

Koraishi, Osama

Type

Article

Publisher:

European Knowledge Development (EUROKD)

Abstract

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading IELTS Writing Task 2. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis compared mean scores and computed reliability measures such as Cohen's weighted kappa and intraclass correlation. The results revealed high agreement in means and substantial reliability between the two grading methods for the majority of texts. However, individual discrepancies and outliers were also identified, underscoring the nuanced nature of grading. While ChatGPT demonstrated efficiency and general alignment with human grading, the study concludes that it should not replace human judgment, particularly because of these observed inconsistencies. The findings contribute insights into the potential and limitations of AI in educational grading and emphasize the importance of comprehensive quantitative evaluation.

Subject

Natural Language Processing (NLP), IELTS, ChatGPT, CALL, Artificial Intelligence in Education (AIEd), Artificial Intelligence (AI)
