Showing 1 - 10 of 2,440 for search: '"Guinan, P"'
Authors:
Bian, Christopher, Cheu, Albert, Chiknavaryan, Stanislav, Gong, Zoe, Gruteser, Marco, Guinan, Oliver, Guzman, Yannis, Kairouz, Peter, Lagzdin, Artem, McKenna, Ryan, Ni, Grace, Roth, Edo, Spivak, Maya, Van Overveldt, Timon, Yi, Ren
This paper introduces Mayfly, a federated analytics approach enabling aggregate queries over ephemeral on-device data streams without central persistence of sensitive user data. Mayfly minimizes data via on-device windowing and contribution bounding
External link:
http://arxiv.org/abs/2412.07962
Authors:
Guo, Xiuyuan, Xu, Chengqi, Guo, Guinan, Zhu, Feiyu, Cai, Changpeng, Wang, Peizhe, Wei, Xiaoming, Su, Junhao, Gao, Jialin
Currently, training large-scale deep learning models is typically achieved through parallel training across multiple GPUs. However, due to the inherent communication overhead and synchronization delays in traditional model parallelism methods, seamle
External link:
http://arxiv.org/abs/2411.12780
Authors:
Geng, Mengzhe, Xie, Xurong, Deng, Jiajun, Jin, Zengrui, Li, Guinan, Wang, Tianzi, Hu, Shujie, Li, Zhaoqing, Meng, Helen, Liu, Xunying
The application of data-intensive automatic speech recognition (ASR) technologies to dysarthric and elderly adult speech is confronted by their mismatch against healthy and nonaged voices, data scarcity and large speaker-level variability. To this en
External link:
http://arxiv.org/abs/2407.06310
Authors:
Hu, Shujie, Xie, Xurong, Geng, Mengzhe, Jin, Zengrui, Deng, Jiajun, Li, Guinan, Wang, Yi, Cui, Mingyu, Wang, Tianzi, Meng, Helen, Liu, Xunying
Self-supervised learning (SSL) based speech foundation models have been applied to a wide range of ASR tasks. However, their application to dysarthric and elderly speech via data-intensive parameter fine-tuning is confronted by in-domain data scarcit
External link:
http://arxiv.org/abs/2407.13782
Authors:
Li, Guinan, Deng, Jiajun, Chen, Youjun, Geng, Mengzhe, Hu, Shujie, Li, Zhe, Jin, Zengrui, Wang, Tianzi, Xie, Xurong, Meng, Helen, Liu, Xunying
This paper proposes joint speaker feature learning methods for zero-shot adaptation of audio-visual multichannel speech separation and recognition systems. xVector and ECAPA-TDNN speaker encoders are connected using purpose-built fusion blocks and ti
External link:
http://arxiv.org/abs/2406.10152
Authors:
Wang, Tianzi, Xie, Xurong, Li, Zhaoqing, Hu, Shoukang, Jin, Zengrui, Deng, Jiajun, Cui, Mingyu, Hu, Shujie, Geng, Mengzhe, Li, Guinan, Meng, Helen, Liu, Xunying
This paper proposes a novel non-autoregressive (NAR) block-based Attention Mask Decoder (AMD) that flexibly balances performance-efficiency trade-offs for Conformer ASR systems. AMD performs parallel NAR inference within contiguous blocks of output l
External link:
http://arxiv.org/abs/2406.10034
Authors:
Cai, Changpeng, Guo, Guinan, Li, Jiao, Su, Junhao, Shen, Fei, He, Chenghao, Xiao, Jing, Chen, Yuanxu, Dai, Lei, Zhu, Feiyu
Most earlier researches on talking face generation have focused on the synchronization of lip motion and speech content. However, head pose and facial emotions are equally important characteristics of natural faces. While audio-driven talking face ge
External link:
http://arxiv.org/abs/2405.07257
Authors:
Clinton, Nicholas, Vollrath, Andreas, D'annunzio, Remi, Liu, Desheng, Glick, Henry B., Descals, Adrià, Sullivan, Alicia, Guinan, Oliver, Abramowitz, Jacob, Stolle, Fred, Goodman, Chris, Birch, Tanya, Quinn, David, Danylo, Olga, Lips, Tijs, Coelho, Daniel, Bihari, Enikoe, Cronkite-Ratcliff, Bryce, Poortinga, Ate, Haghighattalab, Atena, Notman, Evan, DeWitt, Michael, Yonas, Aaron, Donchyts, Gennadii, Shah, Devaja, Saah, David, Tenneson, Karis, Quyen, Nguyen Hanh, Verma, Megha, Wilcox, Andrew
Palm oil production has been identified as one of the major drivers of deforestation for tropical countries. To meet supply chain objectives, commodity producers and other stakeholders need timely information of land cover dynamics in their supply sh
External link:
http://arxiv.org/abs/2405.09530
Authors:
Yang, Yanwu, Ye, Chenfei, Su, Guinan, Zhang, Ziyao, Chang, Zhikai, Chen, Hairui, Chan, Piu, Yu, Yue, Ma, Ting
Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Due to the heterogeneity and hard-to-collect medical data, this approach is especially beneficial for medical
External link:
http://arxiv.org/abs/2403.01433
Authors:
Wang, Huimeng, Jin, Zengrui, Geng, Mengzhe, Hu, Shujie, Li, Guinan, Wang, Tianzi, Xu, Haoning, Liu, Xunying
Automatic recognition of dysarthric speech remains a highly challenging task to date. Neuro-motor conditions and co-occurring physical disabilities create difficulty in large-scale data collection for ASR system development. Adapting SSL pre-trained
External link:
http://arxiv.org/abs/2401.00662