SRIB-LEAP submission to Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing

Authors:	Rohit Kumar, R. G. Prithvi Raj, Anurenjan Purushothaman, M. K. Jayesh, M. A. Basha Shaik, Sriram Ganapathy
Year of publication:	2021
Subject:	Beamforming; Signal Processing (eess.SP); FOS: Computer and information sciences; Sound (cs.SD); Channel (digital image); Computer science; Microphone; Speech recognition; Mean opinion score; Image and Video Processing (eess.IV); Electrical Engineering and Systems Science - Image and Video Processing; computer.software_genre; Convolutional neural network; Computer Science - Sound; Speech enhancement; Videoconferencing; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Electrical Engineering and Systems Science - Signal Processing; computer; PESQ; Electrical Engineering and Systems Science - Audio and Speech Processing
DOI:	10.48550/arxiv.2106.12763
Description:	This paper presents the details of the SRIB-LEAP submission to the ConferencingSpeech 2021 challenge. The challenge task was multi-channel speech enhancement: improving the quality of far-field speech captured by microphone arrays in a video conferencing room. We propose a two-stage method in which a beamformer is followed by single-channel enhancement. For the beamformer, we incorporate a self-attention mechanism as the inter-channel processing layer in the filter-and-sum network (FaSNet), an end-to-end time-domain beamforming system. The single-channel speech enhancement is performed in the log-spectral domain using an architecture based on a convolutional neural network (CNN) and long short-term memory (LSTM). On objective quality metrics, the approach improved the perceptual evaluation of speech quality (PESQ) score by 0.5 over the noisy data. On subjective quality evaluation, it improved the mean opinion score (MOS) by an absolute 0.9 over the noisy audio.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::48d58b365b709f17187e9e7ec2b91676
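
The first stage named in the description, filter-and-sum beamforming, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: FaSNet is described as an end-to-end system, whereas the filters here are fixed by hand, and the function name `filter_and_sum`, the toy signals, and the filter values are all assumptions made for illustration only.

```python
import numpy as np

def filter_and_sum(channels, filters):
    """Filter each microphone channel with its own FIR filter, then sum.

    channels: array of shape (num_mics, num_samples)
    filters:  array of shape (num_mics, filter_len)
    """
    channels = np.asarray(channels, dtype=float)
    filters = np.asarray(filters, dtype=float)
    if channels.shape[0] != filters.shape[0]:
        raise ValueError("need one filter per microphone channel")
    num_samples = channels.shape[1]
    out = np.zeros(num_samples)
    for x, h in zip(channels, filters):
        # Full convolution truncated to the input length (causal filtering).
        out += np.convolve(x, h)[:num_samples]
    return out

# Hypothetical toy input: two microphones, the second delayed by one sample.
src = np.array([1.0, 2.0, 3.0, 4.0, 0.0])
mics = np.stack([src, np.roll(src, 1)])

# Hand-picked filters (an assumption): delay mic 0 by one sample so it
# lines up with mic 1, then average the two aligned channels.
h = np.array([[0.0, 0.5], [0.5, 0.0]])
enhanced = filter_and_sum(mics, h)
```

With these filters the two channels are time-aligned before summation, so the output is the source delayed by one sample. In the two-stage method described above, a beamformed signal of this kind would then be passed to the single-channel enhancement stage.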