Showing 1 - 10 of 1,922 results for search: '"Chandar P"'
Despite their widespread adoption, large language models (LLMs) remain prohibitive to use under resource constraints, with their ever-growing sizes only increasing the barrier for use. One noted issue is the high latency associated with auto-regressi…
External link: http://arxiv.org/abs/2408.08470
3D sensing is a fundamental task for Autonomous Vehicles. Its deployment often relies on aligned RGB cameras and LiDAR. Despite meticulous synchronization and calibration, systematic misalignment persists in the LiDAR-projected depth map. This is due to t…
External link: http://arxiv.org/abs/2407.19154
The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has proven to be effective after pre-training and during fine-tuning, applying quantization in Transformers during…
External link: http://arxiv.org/abs/2407.11722
The widespread use of large language models has brought up essential questions about the potential biases these models might learn. This led to the development of several metrics aimed at evaluating and mitigating these biases. In this paper, we firs…
External link: http://arxiv.org/abs/2406.05918
Author: Thakkar, Megh; Fournier, Quentin; Riemer, Matthew D.; Chen, Pin-Yu; Zouaq, Amal; Das, Payel; Chandar, Sarath
Large language models are first pre-trained on trillions of tokens and then instruction-tuned or aligned to specific preferences. While pre-training remains out of reach for most researchers due to the compute required, fine-tuning has become afforda…
External link: http://arxiv.org/abs/2406.04879
Author: Zholus, Artem; Kuznetsov, Maksim; Schutski, Roman; Shayakhmetov, Rim; Polykovskiy, Daniil; Chandar, Sarath; Zhavoronkov, Alex
Generating novel active molecules for a given protein is an extremely challenging task for generative models, one that requires an understanding of the complex physical interactions between the molecule and its environment. In this paper, we present a nov…
External link: http://arxiv.org/abs/2406.03686
The optimal model for a given task is often challenging to determine, requiring training multiple models from scratch, which becomes prohibitive as dataset and model sizes grow. A more efficient alternative is to reuse smaller pre-trained models by ex…
External link: http://arxiv.org/abs/2405.15895
Interpretability is the study of explaining models in terms understandable to humans. At present, interpretability is divided into two paradigms: the intrinsic paradigm, which holds that only models designed to be explained can be explained, and t…
External link: http://arxiv.org/abs/2405.05386
While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interac…
External link: http://arxiv.org/abs/2405.02749
In the real world, the strong episode-resetting mechanisms that are needed to train agents in simulation are unavailable. The "resetting" assumption limits the potential of reinforcement learning in the real world, as providing resets to an ag…
External link: http://arxiv.org/abs/2405.01684