Showing 1 - 10 of 1,381 results for search: '"Sander, Michael"'
Transformers have achieved state-of-the-art performance in language modeling tasks. However, the reasons behind their tremendous success are still unclear. In this paper, towards a better understanding, we train a Transformer model on a simple next token prediction task…
External link:
http://arxiv.org/abs/2402.05787
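The "next token prediction task" that this abstract refers to can be made concrete with a minimal sketch (illustrative only, not the paper's actual setup): every prefix of a training sequence becomes an input, and the token that follows it becomes the target.

```python
# Minimal illustration of next-token prediction training pairs
# (a generic sketch; the token values here are made up).
tokens = [3, 1, 4, 1, 5]
pairs = [(tokens[:t], tokens[t]) for t in range(1, len(tokens))]
print(pairs)
# -> [([3], 1), ([3, 1], 4), ([3, 1, 4], 1), ([3, 1, 4, 1], 5)]
```

A language model is then trained to maximize the likelihood of each target given its prefix.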
Residual neural networks are state-of-the-art deep learning models. Their continuous-depth analog, neural ordinary differential equations (ODEs), are also widely used. Despite their success, the link between the discrete and continuous models still lacks a solid mathematical foundation…
External link:
http://arxiv.org/abs/2309.01213
The top-k operator returns a sparse vector, where the non-zero values correspond to the k largest values of the input. Unfortunately, because it is a discontinuous function, it is difficult to incorporate in neural networks trained end-to-end with backpropagation…
External link:
http://arxiv.org/abs/2302.01425
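To see why the hard top-k operator is discontinuous, consider a plain NumPy version (an illustrative sketch, not the differentiable relaxation the paper proposes): which entries survive depends on a strict ranking, so an arbitrarily small perturbation of the input can swap the kept entries and make the output jump.

```python
import numpy as np

def top_k(x, k):
    """Hard top-k: keep the k largest entries of x, zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(x, -k)[-k:]  # indices of the k largest values
    out[idx] = x[idx]
    return out

x = np.array([0.1, 3.0, -1.0, 2.5])
print(top_k(x, 2))  # keeps 3.0 and 2.5, zeros elsewhere
```

For example, nudging the input so that the second- and third-largest values trade places changes the support of the output abruptly, which is exactly what breaks gradient-based training.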
Vision Transformers (ViTs) have achieved performance comparable or superior to that of Convolutional Neural Networks (CNNs) in computer vision. This empirical breakthrough is even more remarkable since, in contrast to CNNs, ViTs do not embed any visual inductive bias…
External link:
http://arxiv.org/abs/2210.09221
Neural Ordinary Differential Equations (Neural ODEs) are the continuous analog of Residual Neural Networks (ResNets). We investigate whether the discrete dynamics defined by a ResNet are close to the continuous one of a Neural ODE. We first quantify…
External link:
http://arxiv.org/abs/2205.14612
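The discrete-continuous correspondence studied here can be sketched in a toy example: a residual update x_{n+1} = x_n + h·f(x_n) is one explicit Euler step for the ODE x'(t) = f(x(t)). The residual branch and step sizes below are made up for illustration; refining the discretization makes the discrete output approach the continuous flow.

```python
import numpy as np

def f(x):
    return np.tanh(0.5 * x)  # toy residual branch (hypothetical choice)

def resnet_forward(x, depth, h):
    # Discrete residual updates: x_{n+1} = x_n + h * f(x_n),
    # i.e. explicit Euler steps for the ODE x'(t) = f(x(t)).
    for _ in range(depth):
        x = x + h * f(x)
    return x

# Same time horizon t in [0, 1], coarser vs finer discretization.
coarse = resnet_forward(1.0, depth=10, h=1.0 / 10)
fine = resnet_forward(1.0, depth=1000, h=1.0 / 1000)
print(coarse, fine)  # the two values are close
```

The gap between the two outputs shrinks as the depth grows, which is the kind of discrete-to-continuous closeness the abstract sets out to quantify.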
Attention based models such as Transformers involve pairwise interactions between data points, modeled with a learnable attention matrix. Importantly, this attention matrix is normalized with the SoftMax operator, which makes it row-wise stochastic.
External link:
http://arxiv.org/abs/2110.11773
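The row-stochastic property mentioned above is easy to check numerically with a generic scaled dot-product attention sketch (shapes and values here are made up): applying SoftMax row-wise makes every row of the attention matrix a probability distribution.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # queries
K = rng.normal(size=(4, 8))  # keys
A = softmax(Q @ K.T / np.sqrt(8))  # attention matrix

print(A.sum(axis=1))  # each row sums to 1: A is row-wise stochastic
```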
Author:
Grant, Michael C., Crisafi, Cheryl, Alvarez, Adrian, Arora, Rakesh C., Brindle, Mary E., Chatterjee, Subhasis, Ender, Joerg, Fletcher, Nick, Gregory, Alexander J., Gunaydin, Serdar, Jahangiri, Marjan, Ljungqvist, Olle, Lobdell, Kevin W., Morton, Vicki, Reddy, V. Seenu, Salenger, Rawn, Sander, Michael, Zarbock, Alexander, Engelman, Daniel T.
Published in:
In The Annals of Thoracic Surgery April 2024 117(4):669-689
The training of deep residual neural networks (ResNets) with backpropagation has a memory cost that increases linearly with respect to the depth of the network. A way to circumvent this issue is to use reversible architectures. In this paper, we propose…
External link:
http://arxiv.org/abs/2102.07870
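Reversible architectures avoid the linear memory cost because each block's input can be recomputed exactly from its output during the backward pass, so activations need not be stored. A classic additive coupling in this spirit (a generic RevNet-style sketch, not necessarily the architecture proposed in the paper; f and g are made-up branches) looks like:

```python
import numpy as np

def f(x):
    return np.sin(x)        # toy residual branch (hypothetical)

def g(x):
    return 0.5 * np.tanh(x)  # toy residual branch (hypothetical)

def forward(x1, x2):
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def inverse(y1, y2):
    # Exactly undo the forward pass, recovering the inputs
    # without having stored them.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

y1, y2 = forward(1.0, 2.0)
x1, x2 = inverse(y1, y2)
print(x1, x2)  # recovers the original inputs 1.0, 2.0
```

Because `inverse` reconstructs the activations on the fly, memory use is constant in depth rather than linear.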
Academic article
This result cannot be displayed to unauthenticated users; you must log in to view it.
Academic article
This result cannot be displayed to unauthenticated users; you must log in to view it.