Showing 1 - 10 of 3,541 for search: '"P. Ragni"'
Author:
Cross, Mattias, Ragni, Anton
Diffusion Models (DMs) iteratively denoise random samples to produce high-quality data. The iterative sampling process is derived from Stochastic Differential Equations (SDEs), allowing a speed-quality trade-off chosen at inference. Another advantage
External link:
http://arxiv.org/abs/2409.06364
Author:
Ma, Yinghao, Øland, Anders, Ragni, Anton, Del Sette, Bleiz MacSen, Saitis, Charalampos, Donahue, Chris, Lin, Chenghua, Plachouras, Christos, Benetos, Emmanouil, Shatri, Elona, Morreale, Fabio, Zhang, Ge, Fazekas, György, Xia, Gus, Zhang, Huan, Manco, Ilaria, Huang, Jiawen, Guinot, Julien, Lin, Liwei, Marinelli, Luca, Lam, Max W. Y., Sharma, Megha, Kong, Qiuqiang, Dannenberg, Roger B., Yuan, Ruibin, Wu, Shangda, Wu, Shih-Lun, Dai, Shuqi, Lei, Shun, Kang, Shiyin, Dixon, Simon, Chen, Wenhu, Huang, Wenhao, Du, Xingjian, Qu, Xingwei, Tan, Xu, Li, Yizhi, Tian, Zeyue, Wu, Zhiyong, Wu, Zhizheng, Ma, Ziyang, Wang, Ziyu
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models
External link:
http://arxiv.org/abs/2408.14340
Author:
Flynn, Robert, Ragni, Anton
When there is a mismatch between the training and test domains, current speech recognition systems show significant performance degradation. Self-training methods, such as noisy student teacher training, can help address this and enable the adaptatio
External link:
http://arxiv.org/abs/2406.12937
Automatic speech recognition (ASR) research has achieved impressive performance in recent years and has significant potential for enabling access for people with dysarthria (PwD) in augmentative and alternative communication (AAC) and home environmen
External link:
http://arxiv.org/abs/2406.08568
Author:
Mosleth, Ellen Færgestad, Dankel, Simon Erling Nitter, Mellgren, Gunnar, Olmos, Francisco Martin Barajas, Orozco, Lorena Sofia, Lysenko, Artem, Ofstad, Ragni, Begum, Most Champa, Martens, Harald, Liland, Kristian Hovde
General Effect Modelling (GEM) is an umbrella over different methods that utilise effects in the analyses of data with multiple design variables and multivariate responses. To demonstrate the methodology, we here use GEM in gene expression data where
External link:
http://arxiv.org/abs/2404.03029
Author:
Klimin, Serghei N., Tempere, Jacques, Houtput, Matthew, Ragni, Stefano, Hahn, Thomas, Franchini, Cesare, Mishchenko, Andrey S.
Published in:
Phys. Rev. B 110, 075107 (2024)
Including the effect of lattice anharmonicity on electron-phonon interactions has recently garnered attention due to its role as a necessary and significant component in explaining various phenomena, including superconductivity, optical response, and
External link:
http://arxiv.org/abs/2403.18019
Author:
Mogridge, Rhiannon, Close, George, Sutherland, Robert, Hain, Thomas, Barker, Jon, Goetze, Stefan, Ragni, Anton
Neural networks have been successfully used for non-intrusive speech intelligibility prediction. Recently, the use of feature representations sourced from intermediate layers of pre-trained self-supervised and weakly-supervised models has been found
External link:
http://arxiv.org/abs/2401.13611
Author:
Flynn, Robert, Ragni, Anton
For the task of speech recognition, the use of more than 30 seconds of acoustic context during training is uncommon and under-investigated in literature. In this work, we conduct an empirical study on the effect of scaling the sequence length used to
External link:
http://arxiv.org/abs/2310.15672
Recently there has been a lot of interest in non-autoregressive (non-AR) models for speech synthesis, such as FastSpeech 2 and diffusion models. Unlike AR models, these models do not have autoregressive dependencies among outputs which makes inferenc
External link:
http://arxiv.org/abs/2310.12765
Author:
Baars, Woutijn J., Ragni, Daniele
Acoustic spectra of rotor noise yield frequency-distributions of energy within pressure time series. However, they are unable to reveal phase-relations between different frequency components, while the latter play a role in low-frequency intensity mo
External link:
http://arxiv.org/abs/2310.01056