Search results - "Scharenborg, Odette"

Report

Self-supervised Speech Representations Still Struggle with African American Vernacular English

Authors: Chang, Kalvin, Chou, Yi-Hui, Shi, Jiatong, Chen, Hsuan-Ming, Holliday, Nicole, Scharenborg, Odette, Mortensen, David R.

Underperformance of ASR systems for speakers of African American Vernacular English (AAVE) and other marginalized language varieties is a well-documented phenomenon, and one that reinforces the stigmatization of these varieties. We investigate whethe

External link: http://arxiv.org/abs/2408.14262

Report

As Biased as You Measure: Methodological Pitfalls of Bias Evaluations in Speaker Verification Research

Authors: Hutiri, Wiebke, Patel, Tanvina, Ding, Aaron Yi, Scharenborg, Odette

Detecting and mitigating bias in speaker verification systems is important, as datasets, processing choices and algorithms can lead to performance differences that systematically favour some groups of people while disadvantaging others. Prior studies

External link: http://arxiv.org/abs/2408.13614

Report

Improving child speech recognition with augmented child-like speech

Authors: Zhang, Yuanyuan, Yue, Zhengjun, Patel, Tanvina, Scharenborg, Odette

State-of-the-art ASRs show suboptimal performance for child speech. The scarcity of child speech limits the development of child speech recognition (CSR). Therefore, we studied child-to-child voice conversion (VC) from existing child speakers in the

External link: http://arxiv.org/abs/2406.10284

Report

Exploring data augmentation in bias mitigation against non-native-accented speech

Authors: Zhang, Yuanyuan, Herygers, Aaricia, Patel, Tanvina, Yue, Zhengjun, Scharenborg, Odette

Automatic speech recognition (ASR) should serve every speaker, not only the majority "standard" speakers of a language. In order to build inclusive ASR, mitigating the bias against speaker groups who speak in a "non-standard" or "diverse" way i

External link: http://arxiv.org/abs/2312.15499

Report

Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation

Authors: Lin, Zhaofeng, Patel, Tanvina, Scharenborg, Odette

Whispering is a distinct form of speech known for its soft, breathy, and hushed characteristics, often used for private communication. The acoustic characteristics of whispered speech differ substantially from normally phonated speech and the scarcit

External link: http://arxiv.org/abs/2311.05179

Report

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

Authors: Wu, Shilong, Wang, Chenxi, Chen, Hang, Dai, Yusheng, Zhang, Chenyue, Wang, Ruoyu, Lan, Hongbo, Du, Jun, Lee, Chin-Hui, Chen, Jingdong, Watanabe, Shinji, Siniscalchi, Sabato Marco, Scharenborg, Odette, Wang, Zhong-Qiu, Pan, Jia, Gao, Jianqing

Previous Multimodal Information based Speech Processing (MISP) challenges mainly focused on audio-visual speech recognition (AVSR) with commendable success. However, the most advanced back-end recognition systems often hit performance limits due to t

External link: http://arxiv.org/abs/2309.08348

Report

Using Data Augmentations and VTLN to Reduce Bias in Dutch End-to-End Speech Recognition Systems

Authors: Patel, Tanvina, Scharenborg, Odette

Speech technology has improved greatly for norm speakers, i.e., adult native speakers of a language without speech impediments or strong accents. However, non-norm or diverse speaker groups show a distinct performance gap with norm speakers, which we

External link: http://arxiv.org/abs/2307.02009

Report

The Multimodal Information based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition

Authors: Wang, Zhe, Wu, Shilong, Chen, Hang, He, Mao-Kui, Du, Jun, Lee, Chin-Hui, Chen, Jingdong, Watanabe, Shinji, Siniscalchi, Sabato, Scharenborg, Odette, Liu, Diyuan, Yin, Baocai, Pan, Jia, Gao, Jianqing, Liu, Cong

The Multi-modal Information based Speech Processing (MISP) challenge aims to extend the application of signal processing technology in specific scenarios by promoting the research into wake-up words, speaker diarization, speech recognition, and other

External link: http://arxiv.org/abs/2303.06326

Report

Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models

Authors: Ji, Hang, Patel, Tanvina, Scharenborg, Odette

In this work, we analyzed and compared speech representations extracted from different frozen self-supervised learning (SSL) speech pre-trained models on their ability to capture articulatory features (AF) information and their subsequent prediction

External link: http://arxiv.org/abs/2206.12489

Report

Manipulation of oral cancer speech using neural articulatory synthesis

Authors: Halpern, Bence Mark, Rebernik, Teja, Tienkamp, Thomas, van Son, Rob, Brekel, Michiel van den, Wieling, Martijn, Witjes, Max, Scharenborg, Odette

We present an articulatory synthesis framework for the synthesis and manipulation of oral cancer speech for clinical decision making and alleviation of patient stress. Objective and subjective evaluations demonstrate that the framework has acceptable

External link: http://arxiv.org/abs/2203.17072
