Using confidence intervals to determine adequate item sample sizes for vocabulary tests: An essential but overlooked practice

Authors:	Jeffrey Stewart, Henrik Gyllstad, Stuart McLean
Year of publication:	2020
Subject:	050101 languages & linguistics; Linguistics and Language; Vocabulary; media_common.quotation_subject; 05 social sciences; 050301 education; Bootstrapping (linguistics); Language and Linguistics; Confidence interval; Vocabulary tests; Word lists by frequency; Sample size determination; Statistics; 0501 psychology and cognitive sciences; Psychology; 0503 education; Social Sciences (miscellaneous); media_common
Source:	Language Testing. 38:558-579
ISSN:	1477-0946 0265-5322
Description:	The last three decades have seen an increase in tests aimed at measuring an individual’s vocabulary level or size. The target words used in these tests are typically sampled from word frequency lists, which are in turn based on language corpora. Conventionally, test developers sample items from frequency bands of 1000 words; different tests employ different sampling ratios. Some have as few as 5 or 10 items representing the underlying population of words, whereas other tests feature a larger number of items, such as 24, 30, or 40. However, these sample-size choices are very rarely supported by clear empirical evidence. Here, using a bootstrapping approach, we illustrate the effect that a sample-size increase has on confidence intervals of individual learner vocabulary knowledge estimates, and on the inferences that can safely be made from test scores. We draw on a unique dataset consisting of adult L1 Japanese test takers’ performance on two English vocabulary test formats, each featuring 1000 words. Our analysis shows that there are few purposes and settings where as few as 5 to 10 sampled items from a 1000-word frequency band (1K) are sufficient. The use of 30 or more items per 1000-word frequency band and tests consisting of fewer bands is recommended.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::9405a0ea37ac4d3bc05252eaafbb68cd https://doi.org/10.1177/0265532220979562
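
The resampling idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which the record does not specify: the learner below is synthetic, the function name and parameters are hypothetical, and drawing items without replacement from the band is an assumption made because a real test would not repeat a word. The sketch repeatedly draws n items from one learner's responses to a full 1000-word band and reports a percentile interval for the resulting proportion-known estimates.

```python
import random


def interval_for_sample_size(responses, n_items, n_draws=2000, level=0.95, seed=0):
    """Percentile interval of proportion-known estimates from n_items sampled words.

    responses: list of 0/1 values, one per word in the band (1 = known).
    Items are drawn without replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    estimates = sorted(
        sum(rng.sample(responses, n_items)) / n_items
        for _ in range(n_draws)
    )
    lo_idx = int((1 - level) / 2 * n_draws)
    hi_idx = int((1 + level) / 2 * n_draws) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical learner who knows 600 of the 1000 words in a band.
learner = [1] * 600 + [0] * 400

for n in (5, 10, 30):
    low, high = interval_for_sample_size(learner, n)
    print(f"{n:>2} items: interval {low:.2f} to {high:.2f}")
```

Under these assumptions the interval narrows as the number of sampled items grows, which is the pattern the abstract describes when it recommends 30 or more items per band.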