Search results

Report

An Explicit Consistency-Preserving Loss Function for Phase Reconstruction and Speech Enhancement

Authors: Ku, Pin-Jui; Ho, Chun-Wei; Yen, Hao; Siniscalchi, Sabato Marco; Lee, Chin-Hui

In this work, we propose a novel consistency-preserving loss function for recovering the phase information in the context of phase reconstruction (PR) and speech enhancement (SE). Different from conventional techniques that directly estimate the phas

External link: http://arxiv.org/abs/2409.16282

Report

A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models

Authors: Zezario, Ryandhimas E.; Siniscalchi, Sabato M.; Wang, Hsin-Min; Tsao, Yu

This work investigates two strategies for zero-shot non-intrusive speech assessment leveraging large language models. First, we explore the audio analysis capabilities of GPT-4o. Second, we propose GPT-Whisper, which uses Whisper as an audio-to-text

External link: http://arxiv.org/abs/2409.09914

Report

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To explore new c

External link: http://arxiv.org/abs/2409.09785

Report

Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement

Authors: Khan, Muhammad Salman; La Quatra, Moreno; Hung, Kuo-Hsuan; Fu, Szu-Wei; Siniscalchi, Sabato Marco; Tsao, Yu

Self-supervised representation learning (SSL) has attained SOTA results on several downstream speech tasks, but SSL-based speech enhancement (SE) solutions still lag behind. To address this issue, we exploit three main ideas: (i) Transformer-based ma

External link: http://arxiv.org/abs/2408.04773

Report

Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions

Authors: La Quatra, Moreno; Turco, Maria Francesca; Svendsen, Torbjørn; Salvi, Giampiero; Orozco-Arroyave, Juan Rafael; Siniscalchi, Sabato Marco

This work is concerned with devising a robust Parkinson's (PD) disease detector from speech in real-world operating conditions using (i) foundational models, and (ii) speech enhancement (SE) methods. To this end, we first fine-tune several foundation

External link: http://arxiv.org/abs/2406.16128

Report

Speech Analysis of Language Varieties in Italy

Authors: La Quatra, Moreno; Koudounas, Alkis; Baralis, Elena; Siniscalchi, Sabato Marco

Italy exhibits rich linguistic diversity across its territory due to the distinct regional languages spoken in different areas. Recent advances in self-supervised learning provide new opportunities to analyze Italy's linguistic varieties using speech

External link: http://arxiv.org/abs/2406.15862

Report

Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition

Authors: Yen, Hao; Ku, Pin-Jui; Siniscalchi, Sabato Marco; Lee, Chin-Hui

We propose a novel language-universal approach to end-to-end automatic spoken keyword recognition (SKR) leveraging upon (i) a self-supervised pre-trained model, and (ii) a set of universal speech attributes (manner and place of articulation). Specifi

External link: http://arxiv.org/abs/2406.02488

Academic article

Il principio di solidarietà alla luce della Riforma del Terzo settore

Author: Sabato Aliberti

Published in: Culture e Studi del Sociale, Vol 4, Iss 2, Pp 253-259 (2019)

The aim of this article is to provide some insights on the significance of the principle of solidarity, the beating heart of voluntary action, within the recent reform of the Third Sector in Italy. If and in what way it has been exploited for the pur

External link: https://doaj.org/article/a8ae090913614cc3a9c29c6779bd88b9

Report

An Investigation of Incorporating Mamba for Speech Enhancement

Authors: Chao, Rong; Cheng, Wen-Huang; La Quatra, Moreno; Siniscalchi, Sabato Marco; Yang, Chao-Han Huck; Fu, Szu-Wei; Tsao, Yu

This work aims to study a scalable state-space model (SSM), Mamba, for the speech enhancement (SE) task. We exploit a Mamba-based regression model to characterize speech signals and build an SE system upon Mamba, termed SEMamba. We explore the proper

External link: http://arxiv.org/abs/2405.06573

Report

Benchmarking Representations for Speech, Music, and Acoustic Events

Authors: La Quatra, Moreno; Koudounas, Alkis; Vaiani, Lorenzo; Baralis, Elena; Cagliero, Luca; Garza, Paolo; Siniscalchi, Sabato Marco

Limited diversity in standardized benchmarks for evaluating audio representation learning (ARL) methods may hinder systematic comparison of current methods' capabilities. We present ARCH, a comprehensive benchmark for evaluating ARL methods on divers

External link: http://arxiv.org/abs/2405.00934
