Author: |
Finch WH (Holmes Finch, Department of Educational Psychology, Ball State University, Muncie, IN 47306, USA, whfinch@bsu.edu); Finch MH; French BF; McIntosh DE; Moss L |
Language: |
English |
Source: |
Journal of applied measurement [J Appl Meas] 2018; Vol. 19 (1), pp. 26-40. |
Abstract: |
An important aspect of the educational and psychological evaluation of individuals is the selection of scales with appropriate evidence of reliability and validity for inferences and uses of the scores for the population of interest. One key aspect of validity is the degree to which a scale fairly assesses the construct(s) of interest for members of different subgroups within the population. Typically, this issue is addressed statistically through assessment of differential item functioning (DIF) of individual items, or differential test functioning (DTF) of sets of items within the same measure. When selecting an assessment to use for a given application (e.g., measuring intelligence), or which form of an assessment to use for a test administration, researchers need to consider the extent to which the scales work with all members of the population. Little research has examined methods for comparing the amount or magnitude of DIF/DTF present in two or more assessments when deciding which assessment to use. The current study made use of 7 different statistics for this purpose, in the context of intelligence testing. Results demonstrate that by using a variety of effect sizes, the researcher can gain insights into not only which scales may contain the least amount of DTF, but also how they differ with regard to the way in which the DTF manifests itself. |
Database: |
MEDLINE |