Popis: |
Survey measurement scales are expected to be stable: to generate the same values at two timepoints under unchanged conditions. In scale development, stability is assessed by calculating a scale’s test-retest reliability, a prerequisite to validity. Yet a systematic review shows that test-retest reliability values are reported for only 23% of newly developed scales and are typically assessed only at the aggregate level, based on scale-level or subscale-level scores. This study (1) demonstrates how (sub)scale-level test-retest reliability indicators can conceal a lack of response stability at the item level and (2) proposes a complementary protocol for assessing item-level response stability. Assessing stability at both the item and scale levels ensures that only stable items are included in a scale, which, in turn, increases the reliability and validity of the scale and contributes to the replicability of findings in the social sciences. |