Auditory cortex encodes lipreading information through spatially distributed activity

Authors: Ganesan Karthik, Cody Zhewei Cao, Michael I. Demidenko, Andrew Jahn, William C. Stacey, Vibhangini S. Wasade, David Brang
Year of publication: 2022
DOI: 10.1101/2022.11.11.516209
Description: Summary: Face-to-face communication improves the quality and accuracy of heard speech, particularly in noisy environments. Silent lipreading modulates activity in auditory regions, which has been hypothesized to reflect the transformation and encoding of multiple forms of visual speech information used to support hearing processes. Evidence suggests visual timing information as one such signal encoded in auditory areas: seeing when a speaker’s lips come together between words can help listeners parse word-level boundaries. However, it remains unclear how lipreading alters activity in the auditory system to improve speech perception at the single-word level. Using fMRI and intracranial electrodes in patients, here we show that silently lipread words can be classified from neural activity in auditory areas based on distributed spatial information. Lipread words evoked similar representations to the corresponding heard words, consistent with the prediction that automatic lipreading refines the tuning of auditory representations. Similar to heard words, lipread words varied in the distinctiveness of their neural representations in auditory cortex: e.g., the lipread words DIG and GIG evoked more similar neural activity in auditory cortex relative to the more perceptually distinct word FIG, suggesting that lipreading activity reflects probabilistic distributions as opposed to the unique identity of the lipread word. Notably, while visual speech has both excitatory and suppressive effects on auditory firing rates, classification was observed in both neural populations, consistent with the prediction that lipreading contributes to phoneme population tuning by both activating the corresponding representation and suppressing incorrect phonemic representations. These results support a model in which the auditory system combines the joint neural distributions evoked by heard and lipread words to generate a more precise estimate of what was said, particularly during noisy speech.
Database: OpenAIRE