ConvConcatNet: a deep convolutional neural network to reconstruct mel spectrogram from the EEG

Authors: Xiran Xu; Bo Wang; Yujie Yan; Haolin Zhu; Zechen Zhang; Xihong Wu; Jing Chen
Publication year: 2024
Source: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)
Document type: Working Paper
DOI: 10.1109/ICASSPW62465.2024.10626859
Description: To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped to capture a highly dynamic, complex, non-linear system like the brain. Although non-linear methods based on neural networks have been developed recently, reconstructing unseen stimuli from unseen subjects' EEG remains a highly challenging task. This work presents a novel method, ConvConcatNet, to reconstruct mel spectrograms from EEG, combining a deep convolutional neural network with extensive concatenation operations. With the ConvConcatNet model, the Pearson correlation between the reconstructed and target mel spectrograms reached 0.0420, ranking first in Task 2 of the Auditory EEG Challenge. The code and models implementing this work will be available on GitHub: https://github.com/xuxiran/ConvConcatNet
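The challenge metric is the Pearson correlation between the reconstructed and target mel spectrograms. A minimal NumPy sketch of how such a score could be computed is shown below; the flattening of the time-by-mel-band matrix and the variable names are assumptions for illustration, not the challenge's official evaluation code.

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson correlation between two arrays, flattened to 1-D."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Hypothetical example: score a noisy reconstruction of a
# (time frames x mel bands) spectrogram against the target.
rng = np.random.default_rng(0)
target = rng.standard_normal((100, 10))
reconstructed = target + 5.0 * rng.standard_normal((100, 10))

score = pearson_corr(reconstructed, target)
print(f"Pearson r = {score:.4f}")
```

A correlation of 0.0420 looks small in absolute terms, but subject-independent EEG decoding of unseen stimuli is noisy enough that even weak correlations separate competitive systems.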
Comment: 2 pages, 1 figure, 2 tables
Database: arXiv