Towards a Statistical Understanding of Neural Networks: Beyond the Neural Tangent Kernel Theories

Author:	Zhang, Haobo; Lai, Jianfa; Li, Yicheng; Lin, Qian; Liu, Jun S.
Year of publication:	2024
Subject:	Computer Science - Machine Learning; Mathematics - Statistics Theory
Document type:	Working Paper
Description:	A primary advantage of neural networks lies in their feature learning characteristics, which are challenging to analyze theoretically due to the complexity of their training dynamics. We propose a new paradigm for studying feature learning and the resulting benefits in generalizability. After reviewing the neural tangent kernel (NTK) theory and recent results in kernel regression, which address the generalization issue of sufficiently wide neural networks, we examine the limitations and implications of fixed kernel theories (such as the NTK theory) and review recent theoretical advancements in feature learning. Moving beyond the fixed kernel/feature theory, we consider neural networks as adaptive feature models. Finally, we propose an over-parameterized Gaussian sequence model as a prototype model to study the feature learning characteristics of neural networks.
Database:	arXiv
External link:	http://arxiv.org/abs/2412.18756