Teachers or ChatGPT: The Issue of Accuracy and Consistency in L2 Assessment.

Authors: Shabara, Ramy; ElEbyary, Khaled; Boraie, Deena
Subject:
Source: Teaching English with Technology; 2024, Vol. 24 Issue 2, p71-92, 22p
Abstract: Although there are claims that ChatGPT, an AI-based language model, is capable of assessing the writing of L2 learners accurately and consistently in the classroom, a number of recent studies have shown discrepancies between AI and human raters. Furthermore, there is a lack of studies investigating the intrareliability of ChatGPT scores. Accordingly, this study aimed to examine the accuracy and consistency of ChatGPT compared to teachers, as well as with itself, after being trained on a rubric. To accomplish this goal, the study adopted a quantitative correlational non-experimental design. A dataset of 100 writing assignments, submitted by a cohort of B1-level students at an international branch university in Egypt, was analyzed quantitatively. These assignments were initially evaluated and moderated by trained teachers (n=11), and subsequently, the same assignments were also assessed twice by ChatGPT. The findings indicated that teachers' scores exhibited a higher level of accuracy compared to those generated by ChatGPT. The results also revealed that ChatGPT exhibits a moderate, yet questioned, level of intra-rater reliability. The weak-to-moderate correlations between ChatGPT and teacher scores raise concerns about the accuracy and consistency of ChatGPT's scoring of writing assignments. The implications of the findings highlight the potential applications and limitations of ChatGPT in L2 writing assessment. This study contributes to the ongoing discourse on the use of AI technologies in language education and provides insights into the accuracy and reliability of ChatGPT as an evaluation tool for L2 writing. [ABSTRACT FROM AUTHOR]
Database: Complementary Index