Phonetic acquisition in cortical dynamics, a computational approach.

Authors: Dematties D; Universidad de Buenos Aires, Facultad de Ingeniería, Instituto de Ingeniería Biomédica, Ciudad Autónoma de Buenos Aires, Argentina., Rizzi S; Argonne National Laboratory, Lemont, Illinois, United States of America., Thiruvathukal GK; Argonne National Laboratory, Lemont, Illinois, United States of America.; Computer Science Department, Loyola University Chicago, Chicago, Illinois, United States of America., Wainselboim A; Instituto de Ciencias Humanas, Sociales y Ambientales, Centro Científico Tecnológico-CONICET, Ciudad de Mendoza, Mendoza, Argentina., Zanutto BS; Universidad de Buenos Aires, Facultad de Ingeniería, Instituto de Ingeniería Biomédica, Ciudad Autónoma de Buenos Aires, Argentina.; Instituto de Biología y Medicina Experimental-CONICET, Ciudad Autónoma de Buenos Aires, Argentina.
Language: English
Source: PloS one [PLoS One] 2019 Jun 07; Vol. 14 (6), pp. e0217966. Date of Electronic Publication: 2019 Jun 07 (Print Publication: 2019).
DOI: 10.1371/journal.pone.0217966
Abstract: Many computational theories have been developed to improve artificial phonetic classification performance from linguistic auditory streams. However, less attention has been given to psycholinguistic data and neurophysiological features recently found in cortical tissue. We focus on a context in which basic linguistic units, such as phonemes, are extracted and robustly classified by humans and other animals from complex acoustic streams in speech data. We are especially motivated by the fact that 8-month-old human infants can segment words from fluent audio streams based exclusively on the statistical relationships between neighboring speech sounds, without any kind of supervision. In this paper, we introduce a biologically inspired and fully unsupervised neurocomputational approach that incorporates key neurophysiological and anatomical cortical properties, including columnar organization, spontaneous micro-columnar formation, adaptation to contextual activations, and Sparse Distributed Representations (SDRs) produced by means of partial N-Methyl-D-aspartic acid (NMDA) depolarization. Its feature abstraction capabilities show promising phonetic invariance and generalization attributes. Our model improves the performance of a Support Vector Machine (SVM) classifier on monosyllabic, disyllabic and trisyllabic word classification tasks in the presence of environmental disturbances such as white noise, reverberation, and pitch and voice variations. Furthermore, our approach emphasizes potential self-organizing cortical principles, achieving this improvement without any optimization guidance (for example, backpropagation) that would minimize a hypothetical loss function. Thus, our computational model outperforms multiresolution spectro-temporal auditory feature representations using only the statistical sequential structure embedded in the phonotactic rules of the input stream.
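Illustrative note: the statistical cue the abstract refers to, the relationship between neighboring speech sounds, is commonly illustrated with transitional probabilities between adjacent syllables. The Python sketch below is only a toy illustration of that idea and is not the authors' cortical model; the function names, the made-up syllable stream, and the 0.75 threshold are all assumptions introduced here.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Place a word boundary wherever the transitional probability
    between two neighboring syllables falls below `threshold`."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Hypothetical stream built from three invented "words"
# (bidaku, padoti, golabu) concatenated without pauses.
stream = ("bi da ku pa do ti go la bu bi da ku "
          "go la bu pa do ti bi da ku").split()

if __name__ == "__main__":
    print(segment(stream, threshold=0.75))
```

In this toy stream, syllable pairs inside a word always co-occur, while pairs spanning a word boundary vary, so low transitional probability marks the boundaries without any labels.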
Competing Interests: The authors have declared that no competing interests exist.
Database: MEDLINE