Showing 1 - 10
of 284
for search: '"Ellis, Daniel P W"'
Author:
Moore, R. Channing, Ellis, Daniel P. W., Fonseca, Eduardo, Hershey, Shawn, Jansen, Aren, Plakal, Manoj
Published in:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5
Machine learning from training data with a skewed distribution of examples per class can lead to models that favor performance on common classes at the expense of performance on rare ones. AudioSet has a very wide range of priors over its 527 sound e…
External link:
http://arxiv.org/abs/2307.00079
Author:
Ronchini, Francesca, Cornell, Samuele, Serizel, Romain, Turpault, Nicolas, Fonseca, Eduardo, Ellis, Daniel P. W.
Published in:
Proceedings of the 7th Detection and Classification of Acoustic Scenes and Events 2022 Workshop (DCASE2022)
The aim of the Detection and Classification of Acoustic Scenes and Events Challenge Task 4 is to evaluate systems for the detection of sound events in domestic environments using a heterogeneous dataset. The systems need to be able to correctly dete…
External link:
http://arxiv.org/abs/2210.07856
Author:
Huang, Qingqing, Jansen, Aren, Lee, Joonseok, Ganti, Ravi, Li, Judith Yue, Ellis, Daniel P. W.
Music tagging and content-based retrieval systems have traditionally been constructed using pre-defined ontologies covering a rigid set of music attributes or text queries. This paper presents MuLan: a first attempt at a new generation of acoustic mo…
External link:
http://arxiv.org/abs/2208.12415
Author:
Hershey, Shawn, Ellis, Daniel P. W., Fonseca, Eduardo, Jansen, Aren, Liu, Caroline, Moore, R. Channing, Plakal, Manoj
To reveal the importance of temporal precision in ground truth audio event labels, we collected precise (~0.1 sec resolution) "strong" labels for a portion of the AudioSet dataset. We devised a temporally strong evaluation set (including explicit neg…
External link:
http://arxiv.org/abs/2105.07031
Author:
Fonseca, Eduardo, Jansen, Aren, Ellis, Daniel P. W., Wisdom, Scott, Tagliasacchi, Marco, Hershey, John R., Plakal, Manoj, Hershey, Shawn, Moore, R. Channing, Serra, Xavier
Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and each other…
External link:
http://arxiv.org/abs/2105.02132
Author:
Tzinis, Efthymios, Wisdom, Scott, Jansen, Aren, Hershey, Shawn, Remez, Tal, Ellis, Daniel P. W., Hershey, John R.
Recent progress in deep learning has enabled many advances in sound separation and visual scene understanding. However, extracting sound sources which are apparent in natural videos remains an open problem. In this work, we present AudioScope, a nove…
External link:
http://arxiv.org/abs/2011.01143
Author:
Fonseca, Eduardo, Hershey, Shawn, Plakal, Manoj, Ellis, Daniel P. W., Jansen, Aren, Moore, R. Channing, Serra, Xavier
Published in:
IEEE Signal Processing Letters, Vol. 27, 2020, pages 1235-1239
The study of label noise in sound event recognition has recently gained attention with the advent of larger and noisier datasets. This work addresses the problem of missing labels, one of the big weaknesses of large audio datasets, and one of the mos…
External link:
http://arxiv.org/abs/2005.00878
Published in:
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification. Most audio source separation approaches focus only on separating sources belonging to a restricted domain of source class…
External link:
http://arxiv.org/abs/1911.07951
Author:
Jansen, Aren, Ellis, Daniel P. W., Hershey, Shawn, Moore, R. Channing, Plakal, Manoj, Popat, Ashok C., Saurous, Rif A.
Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal uns…
External link:
http://arxiv.org/abs/1911.05894
This paper introduces Task 2 of the DCASE2019 Challenge, titled "Audio tagging with noisy labels and minimal supervision". This task was hosted on the Kaggle platform as "Freesound Audio Tagging 2019". The task evaluates systems for multi-label audio…
External link:
http://arxiv.org/abs/1906.02975