Showing 1 - 10
of 3,800
for search: '"Primus, A."'
Author:
Schmid, Florian, Morocutti, Tobias, Foscarin, Francesco, Schlüter, Jan, Primus, Paul, Widmer, Gerhard
We propose a pre-training pipeline for audio spectrogram transformers for frame-level sound event detection tasks. On top of common pre-training steps, we add a meticulously designed training routine on AudioSet frame-level annotations. This includes
External link:
http://arxiv.org/abs/2409.09546
Dual-encoder-based audio retrieval systems are commonly optimized with contrastive learning on a set of matching and mismatching audio-caption pairs. This leads to a shared embedding space in which corresponding items from the two modalities end up c
External link:
http://arxiv.org/abs/2408.11641
Query-by-Vocal Imitation (QBV) is about searching audio files within databases using vocal imitations created by the user's voice. Since most humans can effectively communicate sound concepts through voice, QBV offers the more intuitive and convenien
External link:
http://arxiv.org/abs/2408.11638
This technical report describes the CP-JKU team's submission for Task 4 Sound Event Detection with Heterogeneous Training Datasets and Potentially Missing Labels of the DCASE 24 Challenge. We fine-tune three large Audio Spectrogram Transformers, PaSS
External link:
http://arxiv.org/abs/2408.00791
A central problem in building effective sound event detection systems is the lack of high-quality, strongly annotated sound event datasets. For this reason, Task 4 of the DCASE 2024 challenge proposes learning from two heterogeneous datasets, includi
External link:
http://arxiv.org/abs/2407.12997
Author:
Primus, Paul, Widmer, Gerhard
Matching raw audio signals with textual descriptions requires understanding the audio's content and the description's semantics and then drawing connections between the two modalities. This paper investigates a hybrid retrieval system that utilizes a
External link:
http://arxiv.org/abs/2406.15897
Author:
Schmid, Florian, Primus, Paul, Heittola, Toni, Mesaros, Annamaria, Martín-Morató, Irene, Koutini, Khaled, Widmer, Gerhard
This article describes the Data-Efficient Low-Complexity Acoustic Scene Classification Task in the DCASE 2024 Challenge and the corresponding baseline system. The task setup is a continuation of previous editions (2022 and 2023), which focused on rec
External link:
http://arxiv.org/abs/2405.10018
This work presents a text-to-audio-retrieval system based on pre-trained text and spectrogram transformers. Our method projects recordings and textual descriptions into a shared audio-caption space in which related examples from different modalities
External link:
http://arxiv.org/abs/2308.04258
Author:
Primus, Paul, Widmer, Gerhard
Varying conditions between the data seen at training and at application time remain a major challenge for machine learning. We study this problem in the context of Acoustic Scene Classification (ASC) with mismatching recording devices. Previous works
External link:
http://arxiv.org/abs/2306.11764
Author:
Omar Jimenez-Lopez, Tui Ray, Christopher Dean, Ilya Slizovskiy, Jessica Deere, Tiffany Wolf, Seth Moore, Alexander Primus, Jennifer Høy-Petersen, Silje Finstad, Jakob Mo, Henning Sørum, Noelle Noyes
Published in:
One Health, Vol 19, Iss , Pp 100933- (2024)
Anthropogenic activities can significantly impact wildlife in natural water bodies, affecting not only the host's physiology but also its microbiome. This study aimed to analyze the gut microbiome and antimicrobial resistance gene profile (i.e., the
External link:
https://doaj.org/article/55ce1c8cb4cf414abbfa083ab1dfc50e