Showing 1 - 5 of 5 for search: '"Paliotta, Daniele"'
Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting …
External link:
http://arxiv.org/abs/2408.15237
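As a rough illustration of the deployment advantage the snippet alludes to, a linear RNN can process a sequence with an O(T) recurrent scan instead of attention's O(T^2) score matrix. The diagonal parameterisation below is a minimal sketch, not Mamba's actual selective state-space update; all names and shapes are illustrative.

```python
import numpy as np

def linear_rnn_scan(x, A, B, C):
    """Sequential scan of a diagonal linear recurrence:
        h_t = A * h_{t-1} + B @ x_t,   y_t = C @ h_t
    Runs in O(T) time with a fixed-size state per step, unlike
    the O(T^2) pairwise interactions of self-attention.
    """
    T, _ = x.shape
    h = np.zeros(A.shape[0])
    ys = np.empty(T)
    for t in range(T):
        h = A * h + B @ x[t]   # elementwise decay plus input projection
        ys[t] = C @ h          # linear readout of the hidden state
    return ys

# Toy usage: 16-step sequence, 4 input features, 8-dim state.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))
A = rng.uniform(0.5, 0.99, size=8)  # stable decays, |A| < 1
B = rng.normal(size=(8, 4))
C = rng.normal(size=8)
print(linear_rnn_scan(x, A, B, C).shape)  # (16,)
```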
Outlier Features (OFs) are neurons whose activation magnitudes significantly exceed the average over a neural network's (NN) width. They are well known to emerge during standard transformer training and have the undesirable effect of hindering quantisation …
External link:
http://arxiv.org/abs/2405.19279
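For concreteness, the definition in the snippet can be turned into a few lines of Python: flag neurons whose average activation magnitude far exceeds the mean over the layer's width. This is a minimal sketch; the 6x ratio is an illustrative threshold, not a value taken from the paper.

```python
import numpy as np

def outlier_features(acts, ratio=6.0):
    """Flag neurons whose mean |activation| exceeds `ratio` times
    the average magnitude across the layer's width.
    acts: (n_tokens, width) activations from one layer.
    The 6x threshold is an illustrative choice, not the paper's.
    """
    per_neuron = np.abs(acts).mean(axis=0)  # (width,)
    layer_avg = per_neuron.mean()
    return np.where(per_neuron > ratio * layer_avg)[0]

# Toy usage: plant two outlier channels in random activations.
rng = np.random.default_rng(1)
acts = rng.normal(size=(512, 768))
acts[:, [13, 421]] *= 50.0
print(outlier_features(acts))  # -> [ 13 421]
```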
Transformer-based language models have found many diverse applications requiring them to process sequences of increasing length. For these applications, the causal self-attention -- which is the only component scaling quadratically w.r.t. the sequence length …
External link:
http://arxiv.org/abs/2306.01160
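A naive implementation makes the quadratic term concrete: causal self-attention materialises a (T, T) score matrix, so both time and memory grow quadratically with sequence length. The sketch below is the textbook formulation, not the sparse variant the paper proposes.

```python
import numpy as np

def causal_attention(Q, K, V):
    """Naive causal self-attention over a length-T sequence.
    The (T, T) score matrix is the O(T^2) bottleneck the
    snippet refers to.
    """
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                   # (T, T) pairwise scores
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                          # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Toy usage: T = 128 tokens, head dimension 64.
rng = np.random.default_rng(2)
Q, K, V = (rng.normal(size=(128, 64)) for _ in range(3))
print(causal_attention(Q, K, V).shape)  # (128, 64)
```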
We present the Graph Forward-Forward (GFF) algorithm, an extension of the Forward-Forward procedure to graphs, able to handle features distributed over a graph's nodes. This allows training graph neural networks with forward passes only, without backpropagation …
External link:
http://arxiv.org/abs/2302.05282
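The Forward-Forward procedure the snippet extends trains each layer on a local "goodness" objective and passes detached activations onward, so no gradient ever crosses a layer boundary. The PyTorch sketch below shows a generic FF layer in this spirit; the graph-specific aggregation of GFF is not reproduced, and the threshold and loss form are illustrative.

```python
import torch

class FFLayer(torch.nn.Module):
    """One Forward-Forward layer: optimised locally so that its
    'goodness' (mean squared activation) is high on positive
    samples and low on negative ones. Layers never exchange
    gradients, so the network trains with forward passes only.
    """
    def __init__(self, d_in, d_out, threshold=2.0):
        super().__init__()
        self.lin = torch.nn.Linear(d_in, d_out)
        self.opt = torch.optim.Adam(self.parameters(), lr=1e-3)
        self.threshold = threshold

    def forward(self, x):
        # Normalise the input so the previous layer's goodness is hidden.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.lin(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).mean(dim=1)
        g_neg = self.forward(x_neg).pow(2).mean(dim=1)
        # Push positive goodness above the threshold, negative below.
        loss = torch.nn.functional.softplus(
            torch.cat([self.threshold - g_pos, g_neg - self.threshold])
        ).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # Detach: the next layer never backpropagates into this one.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```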
Authors:
Sinha, Atul Kumar, Paliotta, Daniele, Máté, Bálint, Pina-Otey, Sebastian, Raine, John A., Golling, Tobias, Fleuret, François
Deep learning methods have gained popularity in high energy physics for fast modeling of particle showers in detectors. Detailed simulation frameworks such as the gold standard Geant4 are computationally intensive, and current deep generative architectures …
External link:
http://arxiv.org/abs/2202.05012