Does Infant-Directed Speech Help Phonetic Learning? A Machine Learning Investigation

Author: Reiko Mazuka, Emmanuel Dupoux, Bogdan Ludusan
Contributors: RIKEN Center for Brain Science [Wako] (RIKEN CBS), RIKEN - Institute of Physical and Chemical Research [Japan] (RIKEN), Universität Bielefeld = Bielefeld University, Department of Psychology and Neuroscience, Duke University [Durham], Apprentissage machine et développement cognitif (CoML), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire de sciences cognitives et psycholinguistique (LSCP), Département d'Etudes Cognitives - ENS Paris (DEC), École normale supérieure - Paris (ENS Paris / ENS-PSL), Université Paris sciences et lettres (PSL), École des hautes études en sciences sociales (EHESS), Centre National de la Recherche Scientifique (CNRS), CIFAR program in Learning in Machines & Brains (CIFAR LMB)
Funding: The research reported in this paper was partly funded by JSPS Grant-in-Aid for Scientific Research (16H06319, 20H05617) and MEXT Grant-in-Aid on Innovative Areas #4903 (Co-creative Language Evolution), 17H06382, to R. Mazuka. The work of E. Dupoux in his EHESS role was supported by the European Research Council (ERC-2011-AdG-295810 BOOTPHON), the Agence Nationale pour la Recherche (ANR-10-LABX-0087 IEC, ANR-10-IDEX-0001-02 PSL*, ANR-19-P3IA-0001 PRAIRIE 3IA Institute), and CIFAR (Learning in Machines and Brains). Part of the work was conducted while E. Dupoux was a visiting scientist at DeepMind and Facebook. B. Ludusan was also supported by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 799022.
Projects: ANR-19-P3IA-0001, PRAIRIE, PaRis Artificial Intelligence Research InstitutE (2019); ANR-17-EURE-0017, FrontCog, Frontières en cognition (2017); ANR-10-IDEX-0001, PSL, Paris Sciences et Lettres (2010); European Project 295810, EC:FP7:ERC, ERC-2011-ADG_20110406, BOOTPHON (2012)
Year of publication: 2020
Subject:
Source: Cognitive Science, Wiley, 2021, 45 (5), ⟨10.1111/cogs.12946⟩
ISSN: 1551-6709, 0364-0213
DOI: 10.1111/cogs.12946
Description: A prominent hypothesis holds that by speaking to infants in infant-directed speech (IDS), as opposed to adult-directed speech (ADS), parents help them learn phonetic categories. Specifically, two characteristics of IDS have been claimed to facilitate learning: hyperarticulation, which makes the categories more separable, and variability, which makes generalization more robust. Here, we test the separability and robustness of vowel category learning on acoustic representations of speech uttered by Japanese adults in ADS, IDS (addressed to 18- to 24-month-olds), or read speech (RS). Separability is determined by means of a distance measure computed between the five short vowel categories of Japanese, while robustness is assessed by testing the ability of six different machine learning algorithms, trained to classify vowels, to generalize to stimuli spoken by a novel speaker in ADS. Using two different speech representations, we find that hyperarticulated speech, in the case of RS, can yield better separability, and that increased between-speaker variability in ADS can yield, for some algorithms, more robust categories. However, these conclusions do not apply to IDS, which yielded neither more separable nor more robust categories than ADS inputs. We discuss the usefulness of machine learning algorithms run on real data to test hypotheses about the functional role of IDS. © 2021 The Authors. Cognitive Science published by Wiley Periodicals LLC on behalf of Cognitive Science Society (CSS).
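The evaluation logic described in the abstract (separability as inter-category distance, robustness as generalization to a novel speaker) can be sketched with toy data. Everything below is hypothetical: synthetic 2-D "formant-like" points stand in for the acoustic representations used in the paper, a nearest-centroid classifier stands in for the six learning algorithms, and all function names, category means, and parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data for the five short vowels of Japanese /a i u e o/:
# each category is a 2-D Gaussian (illustration only; the paper used real
# acoustic features extracted from recorded speech).
VOWELS = ["a", "i", "u", "e", "o"]
MEANS = {"a": (8.0, 13.0), "i": (3.0, 22.0), "u": (3.5, 13.0),
         "e": (5.0, 19.0), "o": (5.5, 10.0)}

def sample_register(spread, shift=(0.0, 0.0), n=200):
    """Draw n tokens per vowel; `spread` mimics within-category variability,
    `shift` mimics a novel speaker's systematic acoustic offset."""
    X, y = [], []
    for label, mu in MEANS.items():
        X.append(rng.normal(np.add(mu, shift), spread, size=(n, 2)))
        y += [label] * n
    return np.vstack(X), np.array(y)

def separability(X, y):
    """Mean pairwise distance between category centroids, scaled by the mean
    within-category spread (a simple stand-in for the paper's measure)."""
    centroids = {v: X[y == v].mean(axis=0) for v in VOWELS}
    within = np.mean([X[y == v].std() for v in VOWELS])
    dists = [np.linalg.norm(centroids[a] - centroids[b])
             for i, a in enumerate(VOWELS) for b in VOWELS[i + 1:]]
    return np.mean(dists) / within

def nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te):
    """Train a nearest-centroid classifier on one register, test on another."""
    centroids = np.stack([X_tr[y_tr == v].mean(axis=0) for v in VOWELS])
    pred = [VOWELS[np.argmin(np.linalg.norm(centroids - x, axis=1))]
            for x in X_te]
    return np.mean(np.array(pred) == y_te)

# "Hyperarticulated" input: same centroids, tighter categories.
X_hyper, y_hyper = sample_register(spread=0.8)
X_plain, y_plain = sample_register(spread=1.6)

# Robustness: train on each register, test on a shifted "novel speaker" in ADS.
X_test, y_test = sample_register(spread=1.6, shift=(1.0, -1.0))
acc_hyper = nearest_centroid_accuracy(X_hyper, y_hyper, X_test, y_test)
acc_plain = nearest_centroid_accuracy(X_plain, y_plain, X_test, y_test)
```

With identical category means, the tighter "hyperarticulated" register scores higher on the separability measure, while cross-speaker accuracy depends on how well the training register's centroids survive the novel speaker's offset; the paper asks exactly these two questions of real IDS, ADS, and RS recordings.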
Database: OpenAIRE