Outdoor Acoustic Event Identification with DNN Using a Quadrotor-Embedded Microphone Array
Author: | Akihide Nagamine, Keisuke Nakamura, Osamu Sugiyama, Satoshi Uemura, Kazuhiro Nakadai, Ryosuke Kojima |
---|---|
Year of publication: | 2017 |
Subject: |
0209 industrial biotechnology; Engineering; Microphone array; General Computer Science; Event (computing); business.industry; Speech recognition; 02 engineering and technology; 030507 speech-language pathology & audiology; 03 medical and health sciences; Identification (information); 020901 industrial engineering & automation; Electrical and Electronic Engineering; 0305 other medical science; business |
Source: | Journal of Robotics and Mechatronics. 29:188-197 |
ISSN: | 1883-8049 0915-3942 |
Description: | [Figure: Software architecture for OCASA with proposed AEI] This paper addresses Acoustic Event Identification (AEI) of acoustic signals observed with a microphone array embedded in a quadrotor that is flying in a noisy outdoor environment. In such an environment, noise generated by rotors, wind, and other sound sources is a serious problem. To solve this, we propose combining two recently introduced approaches: Sound Source Separation (SSS) and Sound Source Identification (SSI). SSS improves the Signal-to-Noise Ratio (SNR) of the input sound, and SSI is then performed on the SNR-improved sound. Two SSS methods are investigated. One is a single-channel algorithm, Robust Principal Component Analysis (RPCA), and the other is a multichannel method, Geometric High-order Decorrelation-based Source Separation (GHDSS-AS). For SSI, we investigate two types of deep neural networks, namely the Stacked denoising Autoencoder (SdA) and the Convolutional Neural Network (CNN), both of which have been extensively studied as high-performance approaches in automatic speech recognition and visual object recognition. Preliminary experiments have shown the effectiveness of the proposed approaches, in particular the combination of GHDSS-AS and CNN. This combination correctly identified over 80% of sounds in an 8-class sound classification task recorded by a hovering quadrotor. In addition, measurements of prediction time showed that the implemented CNN identifier could run even on a low-end CPU. |
Database: | OpenAIRE |
External link: |