Showing 1 - 10 of 212 for search: '"Pehlevan, Cengiz"'
We consider neural networks (NNs) where the final layer is down-scaled by a fixed hyperparameter $\gamma$. Recent work has identified $\gamma$ as controlling the strength of feature learning. As $\gamma$ increases, network evolution changes from "lazy" …
External link: http://arxiv.org/abs/2410.04642
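A minimal sketch of the kind of output scaling the abstract describes, assuming a two-layer network whose readout is divided by $\gamma$ (the paper's exact parameterization, e.g. any width-dependent factors, may differ):

import numpy as np

def two_layer_net(x, W1, w2, gamma):
    h = np.tanh(W1 @ x)      # hidden-layer features
    # Readout down-scaled by the fixed hyperparameter gamma: with larger gamma
    # the raw output is smaller, so the hidden weights must move more during
    # training, the "rich" (feature-learning) end of the lazy-to-rich axis.
    return (w2 @ h) / gamma

rng = np.random.default_rng(0)
W1 = rng.standard_normal((128, 10)) / np.sqrt(10)
w2 = rng.standard_normal(128) / np.sqrt(128)
print(two_layer_net(rng.standard_normal(10), W1, w2, gamma=4.0))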
Convolutional Neural Networks (CNNs) excel in many visual tasks, but they tend to be sensitive to slight input perturbations that are imperceptible to the human eye, often resulting in task failures. Recent studies indicate that training CNNs with …
External link: http://arxiv.org/abs/2410.03952
We develop a solvable model of neural scaling laws beyond the kernel limit. Theoretical analysis of this model shows how performance scales with model size, training time, and the total amount of available data. We identify three scaling regimes corresponding to …
External link: http://arxiv.org/abs/2409.17858
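For orientation, analyses in this literature typically yield an additive power-law picture in which each regime corresponds to one resource bottlenecking the loss. The form below is the generic ansatz, not the paper's derived result, and the exponents are placeholders:

$L(N, T, D) \approx L_\infty + c_N N^{-\alpha_N} + c_T T^{-\alpha_T} + c_D D^{-\alpha_D}$

Here $N$ is model size, $T$ training time, and $D$ the amount of data; whichever term dominates identifies the model-, time-, or data-limited regime.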
Recent years have seen substantial advances in our understanding of high-dimensional ridge regression, but existing theories assume that training examples are independent. By leveraging recent techniques from random matrix theory and free probability, …
External link: http://arxiv.org/abs/2408.04607
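For reference, the estimator at the center of such theories is the standard ridge regressor; what this line of work relaxes is the independence assumption on the rows of $X$, not the estimator itself. A minimal sketch:

import numpy as np

def ridge_estimator(X, y, lam):
    # beta_hat = (X^T X + lam * I_d)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)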
We investigate the behavior of the Nadaraya-Watson kernel smoothing estimator in high dimensions using its relationship to the random energy model and to dense associative memories.
Comment: 9 pages, 3 figures
External link: http://arxiv.org/abs/2408.03769
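The estimator named in the abstract has a standard closed form. A minimal NumPy sketch with a Gaussian kernel of bandwidth h (the kernel choice and the high-dimensional scaling of h studied in the paper are not specified here):

import numpy as np

def nadaraya_watson(x_query, X_train, y_train, h):
    # f_hat(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)
    d2 = np.sum((X_train - x_query) ** 2, axis=1)   # squared distances ||x - x_i||^2
    w = np.exp(-d2 / (2 * h ** 2))                  # Gaussian kernel weights
    return np.dot(w, y_train) / np.sum(w)           # weighted average of training targets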
Hyperbolic spaces have increasingly been recognized for their outstanding performance in handling data with inherent hierarchical structures compared to their Euclidean counterparts. However, learning in hyperbolic spaces poses significant challenges …
External link: http://arxiv.org/abs/2405.17198
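For context, the basic object such methods optimize over is a hyperbolic distance; the Poincaré-ball metric below is the standard example (the paper's specific model and the learning difficulties it addresses are not reproduced here):

import numpy as np

def poincare_distance(u, v, eps=1e-7):
    # d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))
    # for points u, v strictly inside the unit ball.
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq / max(denom, eps))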
The vulnerability of neural network classifiers to adversarial attacks is a major obstacle to their deployment in safety-critical applications. Regularization of network parameters during training can be used to improve adversarial robustness and generalization …
External link: http://arxiv.org/abs/2405.17181
In this work, we analyze various scaling limits of the training dynamics of transformer models in the feature learning regime. We identify the set of parameterizations that admit well-defined infinite width and depth limits, allowing the attention layers …
External link: http://arxiv.org/abs/2405.15712
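One concrete knob in such parameterizations is the attention-logit scale. The sketch below contrasts the standard $1/\sqrt{d}$ choice with the $1/d$ choice known from the infinite-width literature to give better-behaved limits; treating this as the paper's full parameterization is an assumption:

import numpy as np

def attention(Q, K, V, scale):
    # Softmax attention with a configurable logit scale: the standard choice is
    # scale = 1/sqrt(d_head); width-limit-friendly parameterizations use 1/d_head.
    logits = (Q @ K.T) * scale
    logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
    A = np.exp(logits)
    A /= A.sum(axis=-1, keepdims=True)             # rows sum to one
    return A @ V

n, d = 8, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out_std = attention(Q, K, V, scale=1.0 / np.sqrt(d))
out_lim = attention(Q, K, V, scale=1.0 / d)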
Author: Tong, William L., Pehlevan, Cengiz
In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, is often assumed to be a unique hallmark of Transformer models. By examining commonly employed synthetic ICL tasks, we demonstrate that multi-layer perceptrons …
External link: http://arxiv.org/abs/2405.15618
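A sketch of the kind of synthetic ICL task this line of work commonly uses; in-context linear regression, where each sequence carries its own latent task vector, is the standard example (whether it matches the paper's exact task suite is an assumption):

import numpy as np

def icl_regression_batch(batch, n_ctx, d, rng):
    # Each sequence gets its own weight vector w; the learner must infer w
    # from the (x, y) context pairs and predict y for a held-out query x.
    w = rng.standard_normal((batch, d))
    X = rng.standard_normal((batch, n_ctx + 1, d))
    y = np.einsum('bnd,bd->bn', X, w)                         # y_i = <w, x_i>
    context = np.concatenate([X[:, :n_ctx], y[:, :n_ctx, None]], axis=-1)
    return context, X[:, n_ctx], y[:, n_ctx]                  # pairs, query, target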
Transformers have a remarkable ability to learn and execute tasks based on examples provided within the input itself, without explicit prior training. It has been argued that this capability, known as in-context learning (ICL), is a cornerstone of Transformers' …
External link: http://arxiv.org/abs/2405.11751