AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks
Author: Abdulatif, Sherif; Armanious, Karim; Guirguis, Karim; Sajeev, Jayasankar T.; Yang, Bin
Year of publication: 2019
Subject:
Document type: Working Paper
DOI: 10.23919/Eusipco47968.2020.9287606
Description: Automatic speech recognition (ASR) systems are of vital importance nowadays in commonplace tasks such as speech-to-text processing and language translation. This has created the need for ASR systems that can operate reliably in realistic, crowded environments. Speech enhancement is therefore a valuable building block in ASR systems and in other applications such as hearing aids, smartphones, and teleconferencing systems. In this paper, a generative adversarial network (GAN) based framework is investigated for the task of speech enhancement, more specifically speech denoising of audio tracks. A new architecture based on a CasNet generator and an additional feature-based loss are incorporated to preserve realistic speech phonetics in the denoised output. Finally, the proposed framework is shown to outperform other learning-based and traditional model-based speech enhancement approaches. Comment: 5 pages, 4 figures, and 2 tables. Accepted at EUSIPCO 2020.
Database: arXiv
External link:
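
The abstract names the main ingredients (a GAN operating on time-frequency representations, a CasNet generator, and an extra feature-based loss) without giving implementation details. Below is a minimal, illustrative PyTorch sketch of one adversarial denoising training step on magnitude spectrograms; the small placeholder CNNs, the feature-matching form of the feature-based loss, and the weight `lambda_feat` are assumptions made for illustration and are not the paper's CasNet architecture or exact loss.

```python
# Illustrative sketch only: generic GAN denoising step on spectrograms.
# The networks below are small placeholder CNNs, NOT the paper's CasNet,
# and the "feature-based loss" is approximated here as feature matching
# on the discriminator's intermediate activations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Maps a noisy magnitude spectrogram to a denoised one."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores spectrograms as denoised vs. real clean and exposes an
    intermediate feature map used for the feature-based loss."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.head = nn.Conv2d(64, 1, 3, padding=1)
    def forward(self, x):
        feat = self.features(x)
        return self.head(feat), feat

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def train_step(noisy_spec, clean_spec, lambda_feat=10.0):
    """One adversarial update on a batch of (noisy, clean) spectrograms,
    each shaped (batch, 1, freq_bins, time_frames). lambda_feat is an
    illustrative weight, not a value from the paper."""
    # Discriminator update: real clean vs. generated denoised spectrograms.
    denoised = G(noisy_spec).detach()
    real_score, _ = D(clean_spec)
    fake_score, _ = D(denoised)
    loss_d = (
        F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score))
        + F.binary_cross_entropy_with_logits(fake_score, torch.zeros_like(fake_score))
    )
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: adversarial loss plus feature-matching loss.
    denoised = G(noisy_spec)
    fake_score, fake_feat = D(denoised)
    _, real_feat = D(clean_spec)
    loss_adv = F.binary_cross_entropy_with_logits(fake_score, torch.ones_like(fake_score))
    loss_feat = F.l1_loss(fake_feat, real_feat.detach())
    loss_g = loss_adv + lambda_feat * loss_feat
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

# Example usage with random tensors standing in for STFT magnitudes.
noisy = torch.rand(4, 1, 128, 64)
clean = torch.rand(4, 1, 128, 64)
print(train_step(noisy, clean))
```

Matching intermediate discriminator activations between denoised and clean spectrograms is one common way to realize a "feature-based" loss in GAN enhancement frameworks; the paper's actual loss formulation should be taken from the publication itself.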