Showing 1 - 10 of 300
for search: '"Ghodsi, Ali"'
Author:
Rajabzadeh, Hossein, Jafari, Aref, Sharma, Aman, Jami, Benyamin, Kwon, Hyock Ju, Ghodsi, Ali, Chen, Boxing, Rezagholizadeh, Mehdi
Large Language Models (LLMs), with their increasing depth and number of parameters, have demonstrated outstanding performance across a variety of natural language processing tasks. However, this growth in scale leads to increased computational demand…
External link:
http://arxiv.org/abs/2409.14595
Author:
Kavehzadeh, Parsa, Pourreza, Mohammadreza, Valipour, Mojtaba, Zhu, Tianshu, Bai, Haoli, Ghodsi, Ali, Chen, Boxing, Rezagholizadeh, Mehdi
Deployment of autoregressive large language models (LLMs) is costly, and as these models increase in size, the associated costs will become even more considerable. Consequently, different methods have been proposed to accelerate the token generation…
External link:
http://arxiv.org/abs/2407.01955
Quantitative systems pharmacology (QSP) is widely used to assess drug effects and toxicity before the drug goes to clinical trial. However, significant manual distillation of the literature is needed in order to construct a QSP model. Parameters may…
External link:
http://arxiv.org/abs/2404.08019
Author:
Karami, Mahdi, Ghodsi, Ali
In the rapidly evolving field of deep learning, the demand for models that are both expressive and computationally efficient has never been more critical. This paper introduces Orchid, a novel architecture designed to address the quadratic complexity…
External link:
http://arxiv.org/abs/2402.18508
Author:
Rajabzadeh, Hossein, Valipour, Mojtaba, Zhu, Tianshu, Tahaei, Marzieh, Kwon, Hyock Ju, Ghodsi, Ali, Chen, Boxing, Rezagholizadeh, Mehdi
Finetuning large language models requires huge GPU memory, restricting the choice to acquire larger models. While the quantized version of the Low-Rank Adaptation technique, named QLoRA, significantly alleviates this issue, finding the efficient LoRA…
External link:
http://arxiv.org/abs/2402.10462
In regularization Self-Supervised Learning (SSL) methods for graphs, computational complexity increases with the number of nodes in graphs and embedding dimensions. To mitigate the scalability of non-contrastive graph SSL, we propose a novel approach…
External link:
http://arxiv.org/abs/2402.09603
WERank: Towards Rank Degradation Prevention for Self-Supervised Learning Using Weight Regularization
A common phenomenon confining the representation quality in Self-Supervised Learning (SSL) is dimensional collapse (also known as rank degeneration), where the learned representations are mapped to a low dimensional subspace of the representation space…
External link:
http://arxiv.org/abs/2402.09586
Author:
Kavehzadeh, Parsa, Valipour, Mojtaba, Tahaei, Marzieh, Ghodsi, Ali, Chen, Boxing, Rezagholizadeh, Mehdi
Large language models (LLMs) have revolutionized natural language processing (NLP) by excelling at understanding and generating human-like text. However, their widespread deployment can be prohibitively expensive. SortedNet is a recent training technique…
External link:
http://arxiv.org/abs/2309.08968
Author:
Valipour, Mojtaba, Rezagholizadeh, Mehdi, Rajabzadeh, Hossein, Kavehzadeh, Parsa, Tahaei, Marzieh, Chen, Boxing, Ghodsi, Ali
Deep neural networks (DNNs) must cater to a variety of users with different performance needs and budgets, leading to the costly practice of training, storing, and maintaining numerous user/task-specific models. There are solutions in the literature…
External link:
http://arxiv.org/abs/2309.00255
Author:
Ghojogh, Benyamin, Ghodsi, Ali
This is a tutorial paper on Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), and their variants. We start with a dynamical system and backpropagation through time for RNN. Then, we discuss the problems of gradient vanishing and…
External link:
http://arxiv.org/abs/2304.11461