Anomalous Sound Detection Using Deep Audio Representation and a BLSTM Network for Audio Surveillance of Roads
Author: | Yuhan Zhang, Xianku Li, Wang Wucheng, Yanxiong Li, Mingle Liu |
---|---|
Language: | English |
Publication year: | 2018 |
Subject: |
General Computer Science
Computer science; Feature extraction; accident detection; audio surveillance; 02 engineering and technology; Discriminative model; 0502 economics and business; 0202 electrical engineering, electronic engineering, information engineering; General Materials Science; 050210 logistics & transportation; business.industry; 05 social sciences; Detector; Sound detection; General Engineering; Pattern recognition; Deep audio representation; Autoencoder; Support vector machine; bidirectional long short-term memory network; 020201 artificial intelligence & image processing; Artificial intelligence; Mel-frequency cepstrum; lcsh:Electrical engineering. Electronics. Nuclear engineering; business; Classifier (UML); lcsh:TK1-9971 |
Source: | IEEE Access, Vol 6, Pp 58043-58055 (2018) |
ISSN: | 2169-3536 |
Description: | Surveillance systems based on image analysis can automatically detect road accidents so that rescue teams can intervene quickly. However, in some situations the visual information is not sufficiently reliable, and adding a sound detector can greatly improve the overall reliability of the surveillance system. In this paper, we focus on detecting two classes of anomalous sounds for audio surveillance of roads, namely tire skidding and car crashes, whose occurrence is a clear acoustic indication of road accidents or disruptions. In the proposed method, we extract a deep audio representation (DAR) feature and then use a bidirectional long short-term memory network as a classifier to determine the sound class to which each test audio segment belongs. We propose a framework based on a multiple-stage deep autoencoder network (DAN) to extract the DAR, which fuses complementary information from several input features and can thus be more discriminative and robust than those input features. In the experiments, we discuss the influence of the parameter settings of the DAN’s hidden layers on the performance of the DAR and compare the DAR with other features. Furthermore, the proposed method is compared with state-of-the-art methods. When evaluated on data with various signal-to-noise ratios, the results show that the DAR outperforms the other features and that the proposed method is superior to the state-of-the-art methods for detecting anomalous sounds on roads. |
Database: | OpenAIRE |
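
The description mentions evaluation on data with various signal-to-noise ratios. As a minimal sketch of what that usually involves, the snippet below mixes a sound event with background noise at a chosen SNR using the standard definition SNR(dB) = 10 log10(P_event / P_noise). The function names, the synthetic sine "event", the Gaussian "noise", and the 5 dB target are all illustrative assumptions and are not taken from the paper.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def measure_snr_db(event, noise):
    """SNR in decibels between an event signal and a noise signal."""
    return 10 * math.log10(power(event) / power(noise))


def mix_at_snr(event, noise, target_snr_db):
    """Scale `noise` so the event-to-noise power ratio equals the target, then add.

    Returns the mixture and the scaled noise (hypothetical helper, not from the paper).
    """
    if len(event) != len(noise):
        raise ValueError("event and noise must have the same length")
    # Noise power required for the requested SNR.
    target_noise_power = power(event) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled_noise = [gain * n for n in noise]
    mixture = [e + n for e, n in zip(event, scaled_noise)]
    return mixture, scaled_noise


# Synthetic stand-ins: a 440 Hz tone as the "event", white noise as the background.
rng = random.Random(0)
event = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

mixture, scaled_noise = mix_at_snr(event, noise, target_snr_db=5.0)
achieved_snr = measure_snr_db(event, scaled_noise)
```

Repeating the mix over a list of target values would yield test sets at several noise levels; the specific levels used in the paper are not stated in this record.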