Abstrakt: |
This article describes a study conducted to explore differences in rater severity and consistency among inexperienced and experienced raters both before and after rater training. Sixteen raters (eight experienced and eight inexperienced) rated overlapping subsets of essays from a total sample of 60 essays before and after rater training in the context of an operational administration of UCLA’s English as a Second Language Placement Examination (ESLPE). A three-part scale was used, comprising content, rhetorical control, and language. Ratings were analysed using FACETS, a multi-faceted Rasch analysis program that provides estimates of rater severity on a linear scale as well as fit statistics, which are indicators of rater consistency. The analysis showed that the inexperienced raters tended to be both more severe and less consistent in their ratings than the experienced raters before training. After training, the differences between the two groups of raters were less pronounced; however, significant differences in severity were still found among raters, although consistency had improved for most raters. These results provide support for the notion that rater training is more successful in helping raters give more predictable scores (i.e., intra-rater reliability) than in getting them to give identical scores (i.e., inter-rater reliability).