An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network

Authors: Chengyun Zhang, Yonghuan Chen, Zezhou Hao, Xinghui Gao
Language: English
Publication year: 2022
Subject:
Source: Animals, Vol 12, Iss 22, p 3117 (2022)
Document type: article
ISSN: 2076-2615
DOI: 10.3390/ani12223117
Description: Bird sounds have distinct species-specific characteristics, and they are an important way for birds to communicate and transmit information. However, bird sounds recorded in the field are usually mixed, which makes it challenging to identify different bird species and to perform associated tasks. In this study, within a supervised learning framework, we propose a bird sound separation network, a dual-path tiny transformer network, that directly performs end-to-end separation of mixed-species bird sounds in the time domain. This separation network is mainly composed of a dual-path network and a simplified transformer structure, which greatly reduces the computational resources the network requires. Experimental results show that the proposed separation network achieves good separation performance (SI-SNRi reaches 19.3 dB and SDRi reaches 20.1 dB), while, compared with DPRNN and DPTNet, its parameter count and floating-point operations are greatly reduced, which means higher separation efficiency and faster separation speed. The good separation performance and high separation efficiency indicate that the proposed separation network is valuable for distinguishing individual birds and studying the interactions between them, as well as for automatic identification of bird species on a variety of mobile or edge computing devices.
Database: Directory of Open Access Journals