Showing 1 - 10 of 167 for the search: '"Li, Mufan"'
We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term. Our main insight is to dec…
External link:
http://arxiv.org/abs/2402.07355
Author:
Li, Mufan Bill, Nica, Mihai
Recent analyses of neural networks with shaped activations (i.e. the activation function is scaled as the network size grows) have led to scaling limits described by differential equations. However, these results do not a priori tell us anything abou…
External link:
http://arxiv.org/abs/2310.12079
The cost of hyperparameter tuning in deep learning has been rising with model sizes, prompting practitioners to find new tuning methods using a proxy of smaller networks. One such proposal uses $\mu$P parameterized networks, where the optimal hyperpa…
External link:
http://arxiv.org/abs/2309.16620
Author:
Noci, Lorenzo, Li, Chuning, Li, Mufan Bill, He, Bobby, Hofmann, Thomas, Maddison, Chris, Roy, Daniel M.
In deep learning theory, the covariance matrix of the representations serves as a proxy to examine the network's trainability. Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with s…
External link:
http://arxiv.org/abs/2306.17759
Author:
Zhang, Matthew, Chewi, Sinho, Li, Mufan Bill, Balasubramanian, Krishnakumar, Erdogdu, Murat A.
Underdamped Langevin Monte Carlo (ULMC) is an algorithm used to sample from unnormalized densities by leveraging the momentum of a particle moving in a potential well. We provide a novel analysis of ULMC, motivated by two central questions: (1) Can w…
External link:
http://arxiv.org/abs/2302.08049
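The ULMC dynamics summarized in this abstract can be sketched as follows. This is a minimal illustration only: the quadratic potential (standard Gaussian target), step size, and friction coefficient are assumptions chosen for simplicity, not the settings analyzed in the paper.

```python
import numpy as np

def grad_potential(x):
    # Potential U(x) = ||x||^2 / 2, so grad U(x) = x (standard Gaussian target).
    return x

def ulmc_step(x, v, step=0.1, gamma=1.0, rng=None):
    # One Euler-Maruyama step of the underdamped Langevin SDE:
    #   dx = v dt
    #   dv = -grad U(x) dt - gamma * v dt + sqrt(2 * gamma) dB
    # The position update uses momentum; the velocity update combines the
    # potential gradient, friction, and injected Gaussian noise.
    rng = np.random.default_rng() if rng is None else rng
    x_new = x + step * v
    v_new = (v - step * grad_potential(x) - step * gamma * v
             + np.sqrt(2.0 * gamma * step) * rng.standard_normal(x.shape))
    return x_new, v_new

rng = np.random.default_rng(0)
x, v = np.ones(2), np.zeros(2)
samples = []
for _ in range(5000):
    x, v = ulmc_step(x, v, rng=rng)
    samples.append(x.copy())
samples = np.array(samples[1000:])  # discard burn-in
print(samples.mean(axis=0))  # roughly zero for the standard Gaussian target
```

With a fixed step size this discretization has a stationary bias; the paper's analysis concerns how such discretization error trades off against convergence speed.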
The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, given a random covariance matrix defined by the penultimate layer. In this work, we study the distribution of this random matrix. Recent work has shown th…
External link:
http://arxiv.org/abs/2206.02768
Author:
Berthier, Raphaël, Li, Mufan
Gossip algorithms and their accelerated versions have been studied exclusively in discrete time on graphs. In this work, we take a different approach, and consider the scaling limit of gossip algorithms in both large graphs and large number of iterat…
External link:
http://arxiv.org/abs/2202.10742
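A basic discrete-time gossip iteration, the starting point this abstract contrasts with, can be sketched as follows. The cycle graph and synchronous closed-neighborhood averaging are illustrative assumptions; the paper studies scaling limits and accelerated variants, not this elementary scheme.

```python
import numpy as np

def gossip_round(values, adjacency):
    # One synchronous gossip round: each node replaces its value with the
    # average over its closed neighborhood (itself plus its neighbors).
    n = len(values)
    new = np.empty(n)
    for i in range(n):
        neighborhood = np.append(np.nonzero(adjacency[i])[0], i)
        new[i] = values[neighborhood].mean()
    return new

n = 8
# Adjacency matrix of a cycle graph on n nodes.
A = np.zeros((n, n), dtype=int)
for i in range(n):
    A[i, (i + 1) % n] = A[i, (i - 1) % n] = 1

x = np.arange(n, dtype=float)
target = x.mean()
for _ in range(200):
    x = gossip_round(x, A)
print(np.allclose(x, target))  # all values converge to the global mean
```

Because every node here has the same degree, the averaging matrix is doubly stochastic, so the global mean is preserved and the iterates reach consensus at it.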
Classically, the continuous-time Langevin diffusion converges exponentially fast to its stationary distribution $\pi$ under the sole assumption that $\pi$ satisfies a Poincaré inequality. Using this fact to provide guarantees for the discrete-time…
External link:
http://arxiv.org/abs/2112.12662
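The discrete-time counterpart of the Langevin diffusion mentioned in this abstract, the unadjusted Langevin algorithm, can be sketched as follows. The standard Gaussian target and step size are illustrative assumptions, not a rate-optimal choice from the paper.

```python
import numpy as np

def grad_log_pi(x):
    # For the standard Gaussian target, log pi(x) = -||x||^2 / 2 + const,
    # so grad log pi(x) = -x.
    return -x

def ula(x0, n_steps=5000, step=0.05, seed=0):
    # Unadjusted Langevin algorithm: Euler-Maruyama discretization of
    #   dX = grad log pi(X) dt + sqrt(2) dB.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    out = []
    for _ in range(n_steps):
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        out.append(x.copy())
    return np.array(out)

samples = ula(np.ones(2))[1000:]  # discard burn-in
print(samples.mean(axis=0), samples.var(axis=0))  # near 0 and near 1, respectively
```

The discretization introduces a step-size-dependent bias in the stationary distribution, which is exactly the gap between the clean continuous-time guarantee and the discrete-time analysis the abstract refers to.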
Published in:
In AJIC: American Journal of Infection Control May 2024 52(5):533-540
Theoretical results show that neural networks can be approximated by Gaussian processes in the infinite-width limit. However, for fully connected networks, it has been previously shown that for any fixed network width, $n$, the Gaussian approximation…
External link:
http://arxiv.org/abs/2106.04013