Cortical Networks for Recognition of Speech with Simultaneous Talkers

Authors: Christian Herrera Ortiz, Nicole Whittle, Marjorie R. Leek, Christian Brodbeck, Grace Lee, Caleb Barcenas, Samuel Barnes, Barbara Holshouser, Alex C. Yi, Jonathan Henry Venezia
Year of publication: 2021
DOI: 10.31234/osf.io/vea5y
Description: The relative contributions of superior temporal (auditory) vs. inferior frontal and parietal (sensorimotor) networks to recognition of speech against competing speech remain unclear, although the contributions themselves are well established. Here, we use fMRI with spectrotemporal receptive field (STRF) modeling to examine the speech information represented in temporal vs. frontoparietal networks for two speech recognition tasks with and without a competing talker. We also generate ‘neurometric functions’ that describe the relative contributions of these networks to speech recognition performance. Specifically, 25 listeners completed two versions of a 3-Alternative Forced-Choice (3-AFC) competing speech task: “Unison” and “Competing”, in which a female (target) and a male (competing) talker uttered identical or different phrases, respectively. Spectrotemporal modulation filtering was applied to the two-talker mixtures and a “boosting” procedure was used to generate STRF models to predict brain activation from differences in spectrotemporal distortion on each trial. STRF model predictive accuracy was better for Competing than Unison in a bilateral temporal lobe network, and better for Unison than Competing in a large network of frontoparietal and midline brain regions. Agglomerative STRF clustering further revealed three subnetworks: a bilateral superior temporal Intelligibility network, a frontoparietal Distortion network, and a Semantic network distributed across classic semantic memory regions. The Intelligibility and Semantic networks responded primarily to spectrotemporal cues associated with speech intelligibility, regardless of condition. The Distortion network responded to the absence of such cues in both conditions, and in the Competing condition also to the absence of target-talker vocal pitch and the presence of competing-talker vocal pitch, suggesting a generalized response to signal degradation.
Neurometric function analysis showed that: (i) activation in the Intelligibility network was strongly positively correlated with behavioral performance and this relation was entirely STRF-mediated; and (ii) activation in the Distortion network was strongly negatively correlated with performance and this relation was only partially STRF-mediated. The contributions to performance from these networks were partially independent and of roughly equal magnitude. Finally, activation in the Semantic network was weakly positively correlated with performance and this relation was entirely superseded by those in the Intelligibility and Distortion networks. We conclude that: (a) superior temporal regions play a bottom-up, perceptual role in competing speech tasks; (b) frontoparietal regions play a top-down, task-dependent role in competing speech tasks that scales with listening effort; and (c) performance ultimately relies on dynamic interactions between these networks, with additional contributions from semantic regions that likely scale with the semantic predictability of the speech material.
Database: OpenAIRE