Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems

Authors:	Nicholas Mehlman, Anirudh Sreeram, Raghuveer Peri, Shrikanth Narayanan
Language:	English
Year of publication:	2022
Subject:	Pulmonary and Respiratory Medicine; Computer Science - Machine Learning; Pediatrics, Perinatology and Child Health; Electrical Engineering and Systems Science - Audio and Speech Processing
Description:	A variety of recent works have investigated defenses for deep neural networks against adversarial attacks, particularly within the image processing domain. Speech processing applications such as automatic speech recognition (ASR) increasingly rely on deep learning models, and so are also prone to adversarial attacks. However, many of the defenses explored for ASR simply adapt image-domain defenses, which may not provide optimal robustness. This paper explores speech-specific defenses using the mel spectral domain, and introduces a novel defense method called 'mel domain noise flooding' (MDNF). MDNF applies additive noise to the mel spectrogram of a speech utterance prior to re-synthesising the audio signal. We test the defenses against strong white-box adversarial attacks such as projected gradient descent (PGD) and Carlini-Wagner (CW) attacks, and show better robustness compared to a randomized smoothing baseline across strong threat models. Comment: This paper is 5 pages long and was submitted to Interspeech 2022
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a90c30150d393649eb119a1522d5a575 http://arxiv.org/abs/2203.15283
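
The description above says that MDNF adds noise to the mel spectrogram of an utterance and then re-synthesises the audio. The record gives no implementation details, so the following is only a minimal NumPy sketch of that general idea, not the authors' method. Everything specific in it is an assumption made for illustration: the function names, the Gaussian noise scaled by a hypothetical `snr_db` parameter, the STFT settings, and the re-synthesis step, which maps the noisy mel spectrogram back through the pseudo-inverse of the filterbank and reuses the original phase.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def stft(x, n_fft, hop):
    """Hann-windowed STFT with reflect padding, shape (freq, frames)."""
    window = np.hanning(n_fft)
    x = np.pad(x, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

def istft(spec, n_fft, hop, length):
    """Weighted overlap-add inverse of stft() above (assumes hop <= n_fft // 2)."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1) * window
    out = np.zeros(n_fft + hop * (frames.shape[0] - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2:n_fft // 2 + length]

def mel_domain_noise_flooding(audio, sr=16000, n_fft=512, hop=128,
                              n_mels=40, snr_db=20.0, rng=None):
    """Illustrative MDNF-style defense; all parameters are assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    spec = stft(audio, n_fft, hop)
    magnitude, phase = np.abs(spec), np.angle(spec)
    fb = mel_filterbank(sr, n_fft, n_mels)
    mel = fb @ magnitude  # mel magnitude spectrogram, shape (n_mels, frames)
    # Assumed noise model: Gaussian noise at a target SNR, clipped at zero
    # so the noisy mel spectrogram stays non-negative.
    noise_std = np.sqrt(np.mean(mel ** 2) / 10.0 ** (snr_db / 10.0))
    noisy_mel = np.maximum(mel + rng.normal(0.0, noise_std, mel.shape), 0.0)
    # Assumed re-synthesis: pseudo-inverse of the filterbank plus original phase.
    approx_mag = np.maximum(np.linalg.pinv(fb) @ noisy_mel, 0.0)
    return istft(approx_mag * np.exp(1j * phase), n_fft, hop, len(audio))
```

In this sketch the defended waveform would be passed to the ASR model in place of the raw input, for example `defended = mel_domain_noise_flooding(audio, snr_db=20.0)`. A lower `snr_db` floods the mel spectrogram with more noise, which would be expected to trade transcription accuracy on clean speech against robustness. A neural vocoder or Griffin-Lim phase estimation could replace the phase-reuse step.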