Showing 1 - 10
of 2,802
for search: '"Mei, Song"'
The typical training of neural networks using large stepsize gradient descent (GD) under the logistic loss often involves two distinct phases: the empirical risk oscillates in the first phase but decreases monotonically in the second. We…
External link:
http://arxiv.org/abs/2406.08654
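For intuition, a minimal sketch (an illustrative setup assumed here, not the paper's experiments) of large-stepsize GD on the logistic loss over toy separable data, where the risk typically oscillates before settling into monotone decrease:

import numpy as np

# Toy logistic regression on linearly separable 2-D data (assumed setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X @ np.array([1.0, -1.0]))

def risk(w):
    # Empirical logistic risk, computed stably via logaddexp.
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))

def grad(w):
    # d/dw of the logistic loss: -mean_i y_i * sigmoid(-y_i x_i.w) * x_i.
    p = np.exp(-np.logaddexp(0.0, y * (X @ w)))   # stable sigmoid of -margin
    return -(X * (y * p)[:, None]).mean(axis=0)

w, eta = np.zeros(2), 20.0                        # deliberately large stepsize
losses = [risk(w)]
for _ in range(200):
    w = w - eta * grad(w)
    losses.append(risk(w))
# Plotting `losses` typically shows an oscillatory first phase followed by a
# monotone-decrease phase, matching the two-phase behavior described above.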
Author:
Mei, Song
U-Nets are among the most widely used architectures in computer vision, renowned for their exceptional performance in applications such as image segmentation, denoising, and diffusion modeling. However, a theoretical explanation of the U-Net architecture…
External link:
http://arxiv.org/abs/2404.18444
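For readers unfamiliar with the architecture, a minimal generic U-Net skeleton in PyTorch (a sketch of the general design, not the model analyzed in the paper), showing the encoder-decoder structure with a skip connection:

import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self, c=16):
        super().__init__()
        self.enc1, self.enc2 = block(1, c), block(c, 2 * c)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(2 * c, c, 2, stride=2)
        self.dec = block(2 * c, c)            # input: upsampled features concat skip
        self.head = nn.Conv2d(c, 1, 1)

    def forward(self, x):
        s1 = self.enc1(x)                     # encoder, full resolution
        x2 = self.enc2(self.pool(s1))         # encoder, half resolution
        u = self.up(x2)                       # decoder, back to full resolution
        return self.head(self.dec(torch.cat([u, s1], dim=1)))

y = TinyUNet()(torch.randn(1, 1, 32, 32))     # output shape: (1, 1, 32, 32)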
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization
Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these applications, diffusion models provide flexible high-dimensional…
External link:
http://arxiv.org/abs/2404.07771
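For context, a standard variance-preserving formulation (generic background, not specific to this survey) writes the forward noising process and its score-based time reversal as

\[
\mathrm{d}x_t = -\tfrac{1}{2}\beta(t)\,x_t\,\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}W_t,
\qquad
\mathrm{d}x_t = \Big[-\tfrac{1}{2}\beta(t)\,x_t - \beta(t)\,\nabla_x \log p_t(x_t)\Big]\mathrm{d}t + \sqrt{\beta(t)}\,\mathrm{d}\bar{W}_t,
\]

where the second SDE runs backward in time and the unknown score \(\nabla_x \log p_t\) is replaced by a learned network.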
Large Language Models (LLMs) often memorize sensitive, private, or copyrighted data during pre-training. LLM unlearning aims to eliminate the influence of undesirable data from the pre-trained model while preserving the model's utilities on other tasks…
External link:
http://arxiv.org/abs/2404.05868
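One common way to formalize this trade-off (a generic formulation, not necessarily the one studied in the paper) is

\[
\min_{\theta}\; -\,\mathbb{E}_{z \sim \mathcal{D}_{\mathrm{forget}}}\big[\ell(z;\theta)\big] \;+\; \lambda\,\mathbb{E}_{z \sim \mathcal{D}_{\mathrm{retain}}}\big[\ell(z;\theta)\big],
\]

i.e., push the loss up on the data to be forgotten while a retain term, weighted by \(\lambda\), preserves utility elsewhere.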
Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm
The quantum approximate optimization algorithm (QAOA) is a general-purpose algorithm for combinatorial optimization. In this paper, we analyze the performance of the QAOA on a statistical estimation problem, namely, the spiked tensor model, which exhibits…
External link:
http://arxiv.org/abs/2402.19456
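For reference, the spiked tensor model posits an observed tensor

\[
\mathbf{T} \;=\; \lambda\, v^{\otimes k} \;+\; \mathbf{W},
\]

where \(v\) is a unit-norm signal vector, \(\lambda\) the signal-to-noise ratio, and \(\mathbf{W}\) a symmetric Gaussian noise tensor (normalization conventions vary); the estimation task is to recover \(v\) from \(\mathbf{T}\).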
We study mean-field variational inference in a Bayesian linear model when the sample size n is comparable to the dimension p. In high dimensions, the common approach of minimizing a Kullback-Leibler divergence from the posterior distribution, or maximizing an evidence lower bound…
External link:
http://arxiv.org/abs/2311.08442
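Concretely, mean-field variational inference restricts attention to product distributions and solves

\[
\hat q \;=\; \operatorname*{arg\,min}_{q = \prod_{j=1}^{p} q_j} \; \mathrm{KL}\big(q \,\big\|\, p(\theta \mid y)\big),
\]

which is equivalent to maximizing the evidence lower bound \(\mathbb{E}_q[\log p(y,\theta)] - \mathbb{E}_q[\log q(\theta)]\) over the same family.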
While large language models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) capabilities, understanding of such capabilities is still in an early stage, where existing theory and mechanistic understanding…
External link:
http://arxiv.org/abs/2310.10616
Large transformer models pretrained on offline reinforcement learning datasets have demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they can make good decisions when prompted with interaction trajectories from unseen environments…
External link:
http://arxiv.org/abs/2310.08566
Authors:
Mei, Song; Wu, Yuchen
We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories utilize the smoothness of score functions, they suffer from the curse of dimensionality…
External link:
http://arxiv.org/abs/2309.11420
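For background, score networks in diffusion models are typically trained with the denoising score-matching objective (generic notation, assumed here)

\[
\min_{s_\theta}\; \mathbb{E}_{t}\,\mathbb{E}_{x_0,\,x_t \mid x_0}\Big[\big\| s_\theta(x_t, t) - \nabla_{x_t}\log p_t(x_t \mid x_0)\big\|_2^2\Big],
\]

whose population minimizer is the true score \(\nabla_{x}\log p_t(x)\); the question above is how efficiently a deep network can represent this minimizer.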
Inference for prediction errors is critical in time series forecasting pipelines. However, providing statistically meaningful uncertainty intervals for prediction errors remains relatively under-explored. Practitioners often resort to forward cross-validation…
External link:
http://arxiv.org/abs/2309.07435
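A minimal sketch of forward (rolling-origin) cross-validation, the generic procedure named above; the persistence forecaster and the synthetic series are illustrative placeholders:

import numpy as np

def forward_cv_errors(series, min_train=50, horizon=10):
    # At each fold, fit on series[:t] and score on the next `horizon` points,
    # so every evaluation respects time order (no leakage from the future).
    errors, t = [], min_train
    while t + horizon <= len(series):
        train, test = series[:t], series[t:t + horizon]
        pred = np.full(horizon, train[-1])   # placeholder: lag-1 persistence
        errors.append(np.mean((test - pred) ** 2))
        t += horizon
    return np.array(errors)                  # one out-of-sample MSE per fold

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=500))          # synthetic random-walk series
print(forward_cv_errors(x).mean())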