L3DAS23: Learning 3D Audio Sources for Audio-Visual Extended Reality

Author:	Riccardo F. Gramaccioni, Christian Marinoni, Changan Chen, Aurelio Uncini, Danilo Comminiello
Language:	English
Year of publication:	2024
Subject:	3D audio; ambisonics; data challenge; sound event localization and detection; speech enhancement; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Source:	IEEE Open Journal of Signal Processing, Vol 5, Pp 632-640 (2024)
Document type:	article
ISSN:	2644-1322
DOI:	10.1109/OJSP.2024.3376297
Description:	The primary goal of the L3DAS (Learning 3D Audio Sources) project is to stimulate and support collaborative research studies concerning machine learning techniques applied to 3D audio signal processing. To this end, the L3DAS23 Challenge, presented at IEEE ICASSP 2023, focuses on two spatial audio tasks of paramount interest for practical uses: 3D speech enhancement (3DSE) and 3D sound event localization and detection (3DSELD). Both tasks are evaluated within augmented reality applications. The aim of this paper is to describe the main results obtained from this challenge. We provide the L3DAS23 dataset, which comprises a collection of first-order Ambisonics recordings in reverberant simulated environments. Indeed, we maintain some general characteristics of the previous L3DAS challenges, featuring a pair of first-order Ambisonics microphones to capture the audio signals and involving multiple-source and multiple-perspective Ambisonics recordings. However, in this new edition, we introduce audio-visual scenarios by including images that depict the frontal view of the environments as captured from the perspective of the microphones. This addition aims to enrich the challenge experience, giving participants tools for exploring a combination of audio and images for solving the 3DSE and 3DSELD tasks. In addition to a brand-new dataset, we provide updated baseline models designed to take advantage of audio-image pairs. To ensure accessibility and reproducibility, we also supply a supporting API for an effortless replication of our results. Lastly, we present the results achieved by the participants of the L3DAS23 Challenge.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/4e4fb4f153b042ab908aa7ae8122c424