Showing 1 - 10 of 1,848 results for search: '"Bietti, A."'
Author:
Vichi, Stefano, Asahi, Shigeo, Bietti, Sergio, Tuktamyshev, Artur, Fedorov, Alexey, Kita, Takashi, Sanguinetti, Stefano
Long-wavelength infrared devices, despite growing interest due to a wide range of applications in commercial, public, and academic sectors, are still struggling to achieve significant improvements over well-established technologies like HgCdTe detect
External link:
http://arxiv.org/abs/2407.18302
In addition to the ability to generate fluent text in various languages, large language models have been successful at tasks that involve basic forms of logical "reasoning" over their context. Recent work found that selectively removing certain compo
External link:
http://arxiv.org/abs/2406.03068
Author:
Golkar, Siavash, Bietti, Alberto, Pettee, Mariel, Eickenberg, Michael, Cranmer, Miles, Hirashima, Keiya, Krawezik, Geraud, Lourie, Nicholas, McCabe, Michael, Morel, Rudy, Ohana, Ruben, Parker, Liam Holden, Blancard, Bruno Régaldo-Saint, Cho, Kyunghyun, Ho, Shirley
Transformers have revolutionized machine learning across diverse domains, yet understanding their behavior remains crucial, particularly in high-stakes applications. This paper introduces the contextual counting task, a novel toy problem aimed at enh
External link:
http://arxiv.org/abs/2406.02585
We study level set teleportation, an optimization sub-routine which seeks to accelerate gradient methods by maximizing the gradient norm on a level-set of the objective function. Since the descent lemma implies that gradient descent (GD) decreases th
External link:
http://arxiv.org/abs/2403.03362
Adam has been shown to outperform gradient descent on large language models by a larger margin than on other tasks, but it is unclear why. We show that a key factor in this performance gap is the heavy-tailed class imbalance found in language tasks.
External link:
http://arxiv.org/abs/2402.19449
This work focuses on the training dynamics of one associative memory module storing outer products of token embeddings. We reduce this problem to the study of a system of particles, which interact according to properties of the data distribution and
External link:
http://arxiv.org/abs/2402.18724
We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear projection and an arbitrary unknown, low-dimensional link function. As such,
External link:
http://arxiv.org/abs/2310.19793
Author:
Parker, Liam, Lanusse, Francois, Golkar, Siavash, Sarra, Leopoldo, Cranmer, Miles, Bietti, Alberto, Eickenberg, Michael, Krawezik, Geraud, McCabe, Michael, Ohana, Ruben, Pettee, Mariel, Blancard, Bruno Régaldo-Saint, Tesileanu, Tiberiu, Cho, Kyunghyun, Ho, Shirley
We present AstroCLIP, a single, versatile model that can embed both galaxy images and spectra into a shared, physically meaningful latent space. These embeddings can then be used - without any model fine-tuning - for a variety of downstream tasks inc
External link:
http://arxiv.org/abs/2310.03024
Author:
McCabe, Michael, Blancard, Bruno Régaldo-Saint, Parker, Liam Holden, Ohana, Ruben, Cranmer, Miles, Bietti, Alberto, Eickenberg, Michael, Golkar, Siavash, Krawezik, Geraud, Lanusse, Francois, Pettee, Mariel, Tesileanu, Tiberiu, Cho, Kyunghyun, Ho, Shirley
We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling. MPP involves training large surrogate models to predict the dynamics of multiple heterogeneous physical systems sim
External link:
http://arxiv.org/abs/2310.02994
Author:
Golkar, Siavash, Pettee, Mariel, Eickenberg, Michael, Bietti, Alberto, Cranmer, Miles, Krawezik, Geraud, Lanusse, Francois, McCabe, Michael, Ohana, Ruben, Parker, Liam, Blancard, Bruno Régaldo-Saint, Tesileanu, Tiberiu, Cho, Kyunghyun, Ho, Shirley
Large Language Models have not yet been broadly adapted for the analysis of scientific datasets due in part to the unique difficulties of tokenizing numbers. We propose xVal, a numerical encoding scheme that represents any real number using just a si
External link:
http://arxiv.org/abs/2310.02989