Showing 1 - 10 of 199 for search: '"Scharenborg, Odette"'
Author:
Chang, Kalvin, Chou, Yi-Hui, Shi, Jiatong, Chen, Hsuan-Ming, Holliday, Nicole, Scharenborg, Odette, Mortensen, David R.
Underperformance of ASR systems for speakers of African American Vernacular English (AAVE) and other marginalized language varieties is a well-documented phenomenon, and one that reinforces the stigmatization of these varieties. We investigate whethe
External link:
http://arxiv.org/abs/2408.14262
Detecting and mitigating bias in speaker verification systems is important, as datasets, processing choices and algorithms can lead to performance differences that systematically favour some groups of people while disadvantaging others. Prior studies
External link:
http://arxiv.org/abs/2408.13614
State-of-the-art ASRs show suboptimal performance for child speech. The scarcity of child speech limits the development of child speech recognition (CSR). Therefore, we studied child-to-child voice conversion (VC) from existing child speakers in the
External link:
http://arxiv.org/abs/2406.10284
Automatic speech recognition (ASR) should serve every speaker, not only the majority "standard" speakers of a language. In order to build inclusive ASR, mitigating the bias against speaker groups who speak in a "non-standard" or "diverse" way i
External link:
http://arxiv.org/abs/2312.15499
Whispering is a distinct form of speech known for its soft, breathy, and hushed characteristics, often used for private communication. The acoustic characteristics of whispered speech differ substantially from normally phonated speech and the scarcit
External link:
http://arxiv.org/abs/2311.05179
Author:
Wu, Shilong, Wang, Chenxi, Chen, Hang, Dai, Yusheng, Zhang, Chenyue, Wang, Ruoyu, Lan, Hongbo, Du, Jun, Lee, Chin-Hui, Chen, Jingdong, Watanabe, Shinji, Siniscalchi, Sabato Marco, Scharenborg, Odette, Wang, Zhong-Qiu, Pan, Jia, Gao, Jianqing
Previous Multimodal Information based Speech Processing (MISP) challenges mainly focused on audio-visual speech recognition (AVSR) with commendable success. However, the most advanced back-end recognition systems often hit performance limits due to t
External link:
http://arxiv.org/abs/2309.08348
Author:
Patel, Tanvina, Scharenborg, Odette
Speech technology has improved greatly for norm speakers, i.e., adult native speakers of a language without speech impediments or strong accents. However, non-norm or diverse speaker groups show a distinct performance gap with norm speakers, which we
External link:
http://arxiv.org/abs/2307.02009
Author:
Wang, Zhe, Wu, Shilong, Chen, Hang, He, Mao-Kui, Du, Jun, Lee, Chin-Hui, Chen, Jingdong, Watanabe, Shinji, Siniscalchi, Sabato, Scharenborg, Odette, Liu, Diyuan, Yin, Baocai, Pan, Jia, Gao, Jianqing, Liu, Cong
The Multi-modal Information based Speech Processing (MISP) challenge aims to extend the application of signal processing technology in specific scenarios by promoting the research into wake-up words, speaker diarization, speech recognition, and other
External link:
http://arxiv.org/abs/2303.06326
In this work, we analyzed and compared speech representations extracted from different frozen self-supervised learning (SSL) speech pre-trained models on their ability to capture articulatory features (AF) information and their subsequent prediction
External link:
http://arxiv.org/abs/2206.12489
Author:
Halpern, Bence Mark, Rebernik, Teja, Tienkamp, Thomas, van Son, Rob, Brekel, Michiel van den, Wieling, Martijn, Witjes, Max, Scharenborg, Odette
We present an articulatory synthesis framework for the synthesis and manipulation of oral cancer speech for clinical decision making and alleviation of patient stress. Objective and subjective evaluations demonstrate that the framework has acceptable
External link:
http://arxiv.org/abs/2203.17072