Showing 1 - 10 of 4,335 for search: '"YIN, Lu"'
Author:
Yin, Lu
The observations from pulsar timing arrays (PTAs), led by the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), have provided opportunities to constrain primordial gravitational waves at low frequencies. In this paper, we analyze…
External link:
http://arxiv.org/abs/2410.07949
This paper investigates the under-explored area of low-rank weight training for large-scale Conformer-based speech recognition models from scratch. Our study demonstrates the viability of this training paradigm for such models, yielding several notable…
External link:
http://arxiv.org/abs/2410.07771
Author:
Bandari, Abhinav, Yin, Lu, Hsieh, Cheng-Yu, Jaiswal, Ajay Kumar, Chen, Tianlong, Shen, Li, Krishna, Ranjay, Liu, Shiwei
Network pruning has emerged as a potential solution to make LLMs cheaper to deploy. However, existing LLM pruning approaches universally rely on the C4 dataset as the calibration data for calculating pruning scores, leaving its optimality unexplored.
External link:
http://arxiv.org/abs/2410.07461
Author:
Xiao, Qiao, Wu, Boqian, Yin, Lu, Gadzinski, Christopher Neil, Huang, Tianjin, Pechenizkiy, Mykola, Mocanu, Decebal Constantin
While deep learning has demonstrated impressive progress, it remains a daunting challenge to learn from hard samples as these samples are usually noisy and intricate. These hard samples play a crucial role in the optimal performance of deep neural networks…
External link:
http://arxiv.org/abs/2409.09196
We test the $n$=3 Ultralight Axion-like model of Early Dark Energy (EDE) with the observations of the $EB$ mode of the cosmic microwave background (CMB) radiation, and local expansion rate measurements. We find that the shape of the CMB $EB$ angular…
External link:
http://arxiv.org/abs/2408.09521
Despite significant advancements in active learning and adversarial attacks, the intersection of these two fields remains underexplored, particularly in developing robust active learning frameworks against dynamic adversarial threats. The challenge of…
External link:
http://arxiv.org/abs/2408.07364
Author:
Jaiswal, Ajay, Yin, Lu, Zhang, Zhenyu, Liu, Shiwei, Zhao, Jiawei, Tian, Yuandong, Wang, Zhangyang
Modern Large Language Models (LLMs) are composed of matrices with billions of elements, making their storage and processing quite demanding in terms of computational resources and memory usage. Being significantly large, such matrices can often be ex…
External link:
http://arxiv.org/abs/2407.11239
Author:
Zhang, Zhenyu, Jaiswal, Ajay, Yin, Lu, Liu, Shiwei, Zhao, Jiawei, Tian, Yuandong, Wang, Zhangyang
Training Large Language Models (LLMs) is memory-intensive due to the large number of parameters and associated optimization states. GaLore, a recent method, reduces memory usage by projecting weight gradients into a low-rank subspace without compromising…
External link:
http://arxiv.org/abs/2407.08296
Author:
Xiao, Qiao, Ma, Pingchuan, Fernandez-Lopez, Adriana, Wu, Boqian, Yin, Lu, Petridis, Stavros, Pechenizkiy, Mykola, Pantic, Maja, Mocanu, Decebal Constantin, Liu, Shiwei
The recent success of Automatic Speech Recognition (ASR) is largely attributed to the ever-growing amount of training data. However, this trend has made model training prohibitively costly and imposed computational demands. While data pruning has been…
External link:
http://arxiv.org/abs/2406.18373
Author:
Fernandez-Lopez, Adriana, Chen, Honglie, Ma, Pingchuan, Yin, Lu, Xiao, Qiao, Petridis, Stavros, Liu, Shiwei, Pantic, Maja
Pre-trained models have been a foundational approach in speech recognition, albeit with associated additional costs. In this study, we propose a regularization technique that facilitates the training of visual and audio-visual speech recognition models…
External link:
http://arxiv.org/abs/2406.17614