Showing 1 - 10 of 622,946 results for search: '"Yong, P"'
Diffusion-based text-to-audio (TTA) generation has made substantial progress, leveraging latent diffusion model (LDM) to produce high-quality, diverse and instruction-relevant audios. However, beyond generation, the task of audio editing remains equa
External link:
http://arxiv.org/abs/2409.12466
Author:
Zhou, Jiaming, Zhao, Shiwan, He, Jiabei, Wang, Hui, Zeng, Wenjia, Chen, Yong, Sun, Haoqin, Kong, Aobo, Qin, Yong
State-of-the-art models like OpenAI's Whisper exhibit strong performance in multilingual automatic speech recognition (ASR), but they still face challenges in accurately recognizing diverse subdialects. In this paper, we propose M2R-whisper, a novel
External link:
http://arxiv.org/abs/2409.11889
We investigate the ergodicity for the stochastic complex Ginzburg-Landau equation with a general non-linear term on the two-dimensional torus driven by a complex-valued space-time white noise. Due to the roughness of complex-valued space-time white n
External link:
http://arxiv.org/abs/2408.11568
Author:
Li, Hang, Zhou, Feng, Ding, Bei, Chen, Jie, Song, Linxuan, Yang, Wenyun, Lau, Yong-Chang, Yang, Jinbo, Li, Yue, Jiang, Yong, Wang, Wenhong
Topological magnetic materials are expected to show multiple transport responses because of their unusual bulk electronic topology in momentum space and topological spin texture in real space. However, such multiple topological properties-hosting mat
External link:
http://arxiv.org/abs/2408.00363
Author:
Bhattacharya, Shohini, Cichy, Krzysztof, Constantinou, Martha, Gao, Xiang, Metz, Andreas, Miller, Joshua, Mukherjee, Swagato, Petreczky, Peter, Steffens, Fernanda, Zhao, Yong
In this work, we present a lattice QCD calculation of the Mellin moments of the twist-2 axial-vector generalized parton distribution (GPD), $\widetilde{H}(x,\xi,t)$, at zero skewness, $\xi$, with multiple values of the momentum transfer, $t$. Our ana
External link:
http://arxiv.org/abs/2410.03539
Author:
Chen, Tianrun, Yu, Chunan, Hu, Yuanqi, Li, Jing, Xu, Tao, Cao, Runlong, Zhu, Lanyun, Zang, Ying, Zhang, Yong, Li, Zejian, Sun, Linyun
In this paper, we propose Img2CAD, the first approach to our knowledge that uses 2D image inputs to generate CAD models with editable parameters. Unlike existing AI methods for 3D model generation using text or image inputs often rely on mesh-based r
External link:
http://arxiv.org/abs/2410.03417
Author:
Jang, Doohyuk, Park, Sihwan, Yang, June Yong, Jung, Yeonsung, Yun, Jihun, Kundu, Souvik, Kim, Sung-Yub, Yang, Eunho
Auto-Regressive (AR) models have recently gained prominence in image generation, often matching or even surpassing the performance of diffusion models. However, one major limitation of AR models is their sequential nature, which processes tokens one
External link:
http://arxiv.org/abs/2410.03355
Large Language Models are applied to recommendation tasks such as items to buy and news articles to read. Point of Interest is quite a new area to sequential recommendation based on language representations of multimodal datasets. As a first step to
External link:
http://arxiv.org/abs/2410.03265
Author:
Zhang, Yong-Kun, Li, Di, Feng, Yi, Tsai, Chao-Wei, Wang, Pei, Niu, Chen-Hui, Chen, Hua-Xi, Zhu, Yu-Hao
The detection of fast radio bursts (FRBs) in radio astronomy is a complex task due to the challenges posed by radio frequency interference (RFI) and signal dispersion in the interstellar medium. Traditional search algorithms are often inefficient, ti
External link:
http://arxiv.org/abs/2410.03200
Author:
Chen, Lebing, Ye, Gaihua, Nnokwe, Cynthia, Pan, Xing-Chen, Tanigaki, Katsumi, Cheng, Guanghui, Chen, Yong P., Yan, Jiaqiang, Mandrus, David G., Allcca, Andres E. Llacsahuanga, Giles-Donovan, Nathan, Birgeneau, Robert J., He, Rui
Optical phonon engineering through nonlinear effects has been utilized in ultrafast control of material properties. However, nonlinear optical phonons typically exhibit rapid decay due to strong mode-mode couplings, limiting their effectiveness in te
External link:
http://arxiv.org/abs/2410.03128