A thorough evaluation of the Language Environment Analysis (LENA™) system

Authors:	Alejandrina Cristia, Marvin Lavechin, Camila Scaff, Melanie Soderstrom, Caroline F Rowland, Okko Räsänen, John P Bunce, Elika Bergelson
Contributors:	Laboratoire de sciences cognitives et psycholinguistique (LSCP), Département d'Etudes Cognitives - ENS Paris (DEC), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS), University of Manitoba [Winnipeg], Aalto University, Max Planck Institute for Evolutionary Anthropology, Duke University [Durham]
Language:	English
Publication year:	2020
Subject:	03 medical and health sciences; 0302 clinical medicine; 4. Education; 030220 oncology & carcinogenesis; 05 social sciences; 0501 psychology and cognitive sciences; 050105 experimental psychology; 030217 neurology & neurosurgery; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Source:	Behavior Research Methods, Psychonomic Society, Inc, 2020, ⟨10.31219/osf.io/mxr8s⟩
ISSN:	1554-351X 1554-3528
DOI:	10.31219/osf.io/mxr8s
Description:	In the previous decade, dozens of studies involving thousands of children across several research disciplines have made use of a combined daylong audio-recorder and automated algorithmic analysis called the LENA® system, which aims to assess children's language environment. While the system's prevalence in the language acquisition domain is steadily growing, there are only scattered validation efforts, on only some of its key characteristics. Here, we assess the LENA® system's accuracy across all of its key measures: speaker classification, Child Vocalization Counts (CVC), Conversational Turn Counts (CTC), and Adult Word Counts (AWC). Our assessment is based on manual annotation of clips that have been randomly or periodically sampled out of daylong recordings, collected from (a) populations similar to the system's original training data (North American English-learning children aged 3-36 months), (b) children learning another dialect of English (UK), and (c) slightly older children growing up in a different linguistic and socio-cultural setting (Tsimane' learners in rural Bolivia). We find reasonably high accuracy in some measures (AWC, CVC), with more problematic levels of performance in others (CTC, precision of male adults and other children). Statistical analyses do not support the view that performance is worse for children who are dissimilar from the LENA® original training set. Whether LENA® results are accurate enough for a given research, educational, or clinical application depends largely on the specifics at hand. We therefore conclude with a set of recommendations to help researchers make this determination for their goals.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::10129a39c726c16c240a0332ca41bf6b https://hal.archives-ouvertes.fr/hal-02989519