Showing 1 - 10 of 106 for search: '"Gao Yingming"'
Published in:
High Temperature Materials and Processes, Vol 41, Iss 1, Pp 364-374 (2022)
The forming-induced residual stress of metallic parts could cause undesired deformation in the final machining process, especially for the thin-walled parts. Therefore, heat treatment is essential to release the residual stress prior to machining. Th
External link:
https://doaj.org/article/34cfe3e292a8436991cea6f6479704cf
Published in:
Shanghai Jiaotong Daxue xuebao, Vol 55, Iss 03, Pp 229-235 (2021)
Aimed at the uncertainty of equipment quantity and input, the defective products and their rework in the manufacturing process are investigated. Considering the influence of the number or input of the devices on system reliability, a manufacturing syst
External link:
https://doaj.org/article/35c66fe0829c4a76ba868ddd1f3fdd3a
To address the limitation in multimodal emotion recognition (MER) performance arising from inter-modal information fusion, we propose a novel MER framework based on multitask learning where fusion occurs after alignment, called Foal-Net. The framewor
External link:
http://arxiv.org/abs/2408.09438
Authors:
Fu, Ruibo, Liu, Rui, Qiang, Chunyu, Gao, Yingming, Lu, Yi, Shi, Shuchen, Wang, Tao, Li, Ya, Wen, Zhengqi, Zhang, Chen, Bu, Hui, Liu, Yukun, Qi, Xin, Li, Guanjun
The Inspirational and Convincing Audio Generation Challenge 2024 (ICAGC 2024) is part of the ISCSLP 2024 Competitions and Challenges track. While current text-to-speech (TTS) technology can generate high-quality audio, its ability to convey complex e
External link:
http://arxiv.org/abs/2407.12038
Diffusion-based singing voice conversion (SVC) models have shown better synthesis quality compared to traditional methods. However, in cross-domain SVC scenarios, where there is a significant disparity in pitch between the source and target voice dom
External link:
http://arxiv.org/abs/2406.05692
Recent prompt-based text-to-speech (TTS) models can clone an unseen speaker using only a short speech prompt. They leverage a strong in-context ability to mimic the speech prompts, including speaker style, prosody, and emotion. Therefore, the selecti
External link:
http://arxiv.org/abs/2406.03714
Recent advances in large language models (LLMs) and development of audio codecs greatly propel the zero-shot TTS. They can synthesize personalized speech with only a 3-second speech of an unseen speaker as acoustic prompt. However, they only support
External link:
http://arxiv.org/abs/2406.03706
Recent advancements in diffusion models and large language models (LLMs) have significantly propelled the field of AIGC. Text-to-Audio (TTA), a burgeoning AIGC application designed to generate audio from natural language prompts, is attracting increa
External link:
http://arxiv.org/abs/2401.01044
Speech emotion recognition (SER) systems aim to recognize human emotional state during human-computer interaction. Most existing SER systems are trained based on utterance-level labels. However, not all frames in an audio have affective states consis
External link:
http://arxiv.org/abs/2312.16383
Authors:
Deng, Yayue, Xue, Jinlong, Jia, Yukang, Li, Qifei, Han, Yichen, Wang, Fengping, Gao, Yingming, Ke, Dengfeng, Li, Ya
Conversational speech synthesis (CSS) incorporates historical dialogue as supplementary information with the aim of generating speech that has dialogue-appropriate prosody. While previous methods have already delved into enhancing context comprehensi
External link:
http://arxiv.org/abs/2312.10358