Showing 1 - 10 of 374 for search: '"MENG Lingwei"'
Author:
Chen Long, Song Wenlong, Sun Tao, Lu Yizhu, Jiang Wei, Liu Jun, Liu Hongjie, Feng Tianshi, Gui Rongjie, Haider Abbas, Meng Lingwei, Lin Shengjie, He Qian
Published in:
Remote Sensing, Vol 15, Iss 20, p 4900 (2023)
Accurate extraction of farmland boundaries is crucial for improving the efficiency of farmland surveys, achieving precise agricultural management, enhancing farmers’ production conditions, protecting the ecological environment, and promoting local
External link:
https://doaj.org/article/7b0b73aa94704213af10b9ae91ae00cd
Author:
Li, Zongyi, Hu, Shujie, Liu, Shujie, Zhou, Long, Choi, Jeongsoo, Meng, Lingwei, Guo, Xun, Li, Jinyu, Ling, Hefei, Wei, Furu
Text-to-video models have recently undergone rapid and substantial advancements. Nevertheless, due to limitations in data and computational resources, achieving efficient generation of long videos with rich motion dynamics remains a significant chall
External link:
http://arxiv.org/abs/2410.20502
Author:
Kang, Jiawen, Han, Dongrui, Meng, Lingwei, Zhou, Jingyan, Li, Jinchao, Wu, Xixin, Meng, Helen
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models to distinguish between individuals with AD and those without. Unlike conventional classification tasks, we identify within
External link:
http://arxiv.org/abs/2409.16322
Author:
Kang, Jiawen, Meng, Lingwei, Cui, Mingyu, Wang, Yuejiao, Wu, Xixin, Liu, Xunying, Meng, Helen
Multi-talker speech recognition (MTASR) faces unique challenges in disentangling and transcribing overlapping speech. To address these challenges, this paper investigates the role of Connectionist Temporal Classification (CTC) in speaker disentanglem
External link:
http://arxiv.org/abs/2409.12388
Author:
Meng, Lingwei, Hu, Shujie, Kang, Jiawen, Li, Zhaoqing, Wang, Yuejiao, Wu, Wenxuan, Wu, Xixin, Liu, Xunying, Meng, Helen
Recent advancements in large language models (LLMs) have revolutionized various domains, bringing significant progress and new opportunities. Despite progress in speech-related tasks, LLMs have not been sufficiently explored in multi-talker scenarios
External link:
http://arxiv.org/abs/2409.08596
Author:
Jin, Zengrui, Yang, Yifan, Shi, Mohan, Kang, Wei, Yang, Xiaoyu, Yao, Zengwei, Kuang, Fangjun, Guo, Liyong, Meng, Lingwei, Lin, Long, Xu, Yong, Zhang, Shi-Xiong, Povey, Daniel
The evolving speech processing landscape is increasingly focused on complex scenarios like meetings or cocktail parties with multiple simultaneous speakers and far-field conditions. Existing methodologies for addressing these challenges fall into two
External link:
http://arxiv.org/abs/2409.00819
Functional magnetic resonance imaging (fMRI) is essential for developing encoding models that identify functional changes in language-related brain areas of individuals with Neurocognitive Disorders (NCD). While large language model (LLM)-based fMRI
External link:
http://arxiv.org/abs/2407.10376
Author:
Meng, Lingwei, Kang, Jiawen, Wang, Yuejiao, Jin, Zengrui, Wu, Xixin, Liu, Xunying, Meng, Helen
Multi-talker speech recognition and target-talker speech recognition, both of which involve transcription in multi-talker contexts, remain significant challenges. However, existing methods rarely attempt to simultaneously address both tasks. In this study, we
External link:
http://arxiv.org/abs/2407.09817
Author:
Meng, Lingwei, Zhou, Long, Liu, Shujie, Chen, Sanyuan, Han, Bing, Hu, Shujie, Liu, Yanqing, Li, Jinyu, Zhao, Sheng, Wu, Xixin, Meng, Helen, Wei, Furu
We present MELLE, a novel continuous-valued tokens based language modeling approach for text to speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from text condition, bypassing the need for vector qua
External link:
http://arxiv.org/abs/2407.08551
Author:
Han, Bing, Zhou, Long, Liu, Shujie, Chen, Sanyuan, Meng, Lingwei, Qian, Yanming, Liu, Yanqing, Zhao, Sheng, Li, Jinyu, Wei, Furu
With the help of discrete neural audio codecs, large language models (LLM) have increasingly been recognized as a promising methodology for zero-shot Text-to-Speech (TTS) synthesis. However, sampling based decoding strategies bring astonishing divers
External link:
http://arxiv.org/abs/2406.07855