Showing 1 - 9 of 9 for search: '"Liao, Fangshuo"'
Principal Component Analysis (PCA) aims to find subspaces spanned by the so-called principal components that best represent the variance in the dataset. The deflation method is a popular meta-algorithm that sequentially finds individual principal components … (an illustrative sketch of deflation follows this record).
External link:
http://arxiv.org/abs/2310.04283
Authors:
Liao, Fangshuo, Kyrillidis, Anastasios
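As context for the deflation meta-algorithm mentioned in the snippet above, here is a minimal NumPy sketch of classical eigenvalue deflation. It is the generic textbook version for illustration only, not the paper's specific variant or its analysis.

import numpy as np

def deflation_pca(X, k):
    """Top-k principal components of a data matrix X (n_samples x n_features) via deflation."""
    C = np.cov(X, rowvar=False)                # sample covariance matrix
    components = []
    for _ in range(k):
        eigvals, eigvecs = np.linalg.eigh(C)
        v, lam = eigvecs[:, -1], eigvals[-1]   # leading eigenpair of the (deflated) covariance
        components.append(v)
        C = C - lam * np.outer(v, v)           # deflation step: remove the recovered component
    return np.stack(components)

# Example: deflation_pca(np.random.default_rng(0).normal(size=(200, 5)), 2) -> shape (2, 5)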
Current state-of-the-art analyses on the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity. While g… (the standard form of the PL condition is stated after this record).
External link:
http://arxiv.org/abs/2306.08109
Authors:
Liu, Zichang, Desai, Aditya, Liao, Fangshuo, Wang, Weitao, Xie, Victor, Xu, Zhaozhuo, Kyrillidis, Anastasios, Shrivastava, Anshumali
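For reference, the Polyak-Łojasiewicz condition named in the snippet above is usually stated as follows (the standard textbook form; the snippet does not show the exact constants or the variant used in the paper):

\[
\tfrac{1}{2}\,\|\nabla f(\theta)\|^2 \;\ge\; \mu\,\bigl(f(\theta) - f^*\bigr) \qquad \text{for all } \theta,
\]

where \(f^*\) is the minimal loss value and \(\mu > 0\). Under this condition, gradient descent with step size \(1/L\) on an \(L\)-smooth loss satisfies \(f(\theta_k) - f^* \le (1 - \mu/L)^k \bigl(f(\theta_0) - f^*\bigr)\), i.e., linear convergence.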
Large language models (LLMs) have sparked a new wave of exciting AI applications. Hosting these models at scale requires significant memory resources. One crucial memory bottleneck for deployment stems from the context window. It is commonly recognized … (a back-of-the-envelope estimate of this bottleneck follows this record).
External link:
http://arxiv.org/abs/2305.17118
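To make the context-window bottleneck concrete, here is a rough, purely illustrative estimate of the key/value cache a decoder-only transformer must keep while generating; the model configuration and numbers below are assumptions for illustration, not figures from the paper.

def kv_cache_bytes(n_layers, hidden, seq_len, batch, bytes_per_elem=2):
    # Two cached tensors (keys and values) per layer, each of shape [batch, seq_len, hidden],
    # stored here in fp16 (2 bytes per element).
    return 2 * n_layers * hidden * seq_len * batch * bytes_per_elem

# Hypothetical 7B-class model: 32 layers, hidden size 4096, 4096-token context, batch 8
print(kv_cache_bytes(32, 4096, 4096, 8) / 2**30)   # -> 16.0 GiB for the cache alone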
The strong Lottery Ticket Hypothesis (LTH) claims the existence of a subnetwork in a sufficiently large, randomly initialized neural network that approximates a target neural network without any training. We extend the theoretical guarantees … (a minimal illustration of such a subnetwork follows this record).
External link:
http://arxiv.org/abs/2210.16589
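A "subnetwork of a randomly initialized network" in the strong-LTH sense simply means a binary mask applied to fixed, untrained random weights. The toy snippet below only illustrates that notion; it does not reproduce the paper's constructions or approximation bounds.

import numpy as np

rng = np.random.default_rng(0)
W_random = rng.normal(size=(16, 8))        # randomly initialized layer, never trained
mask = rng.random(W_random.shape) < 0.3    # binary mask selecting ~30% of the weights
W_sub = mask * W_random                    # effective weights of the chosen subnetwork

x = rng.normal(size=8)
h = np.maximum(W_sub @ x, 0.0)             # one ReLU layer of the pruned-at-init subnetwork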
Recent work on the Lottery Ticket Hypothesis (LTH) shows that there exist "winning tickets" in large neural networks. These tickets represent "sparse" versions of the full model that can be trained independently to achieve comparable accuracy … (the canonical recipe for finding such tickets is sketched after this record).
External link:
http://arxiv.org/abs/2210.16169
Authors:
Liao, Fangshuo, Kyrillidis, Anastasios
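For context, the canonical procedure for finding a winning ticket is iterative magnitude pruning with weight rewinding, sketched below. The train routine is a placeholder for an ordinary training loop (an assumption), and the paper above may use a different or refined procedure.

import numpy as np

def find_ticket(theta0, train, prune_frac=0.2, rounds=3):
    mask = np.ones_like(theta0)                      # start from the dense network
    for _ in range(rounds):
        theta = train(theta0 * mask, mask)           # train the current subnetwork from the original init
        thresh = np.quantile(np.abs(theta[mask == 1]), prune_frac)
        mask = mask * (np.abs(theta) >= thresh)      # prune the smallest surviving weights
    return mask                                      # winning ticket = mask plus the original init theta0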
Motivated by training all the parameters of a neural network, we study why and when one can achieve this by iteratively creating, training, and combining randomly selected subnetworks. Such scenarios have either implicitly or explicitly emerged … (a sketch of this create/train/combine loop follows this record).
External link:
http://arxiv.org/abs/2112.02668
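The create/train/combine loop described in the snippet can be sketched roughly as below. Here train_masked stands in for training only the masked parameters, and the averaging rule for the combine step is an assumption for illustration; neither is the paper's algorithm or its convergence conditions.

import numpy as np

def subnetwork_rounds(theta, train_masked, n_subnets=4, keep=0.5, rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        masks = [rng.random(theta.shape) < keep for _ in range(n_subnets)]   # create random subnetworks
        updates = [train_masked(theta, m) for m in masks]                    # train each one separately
        covered = sum(m.astype(float) for m in masks)
        merged = sum(u * m for u, m in zip(updates, masks)) / np.maximum(covered, 1.0)
        theta = np.where(covered > 0, merged, theta)                         # combine; keep untouched entries
    return theta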
Neural network pruning is useful for discovering efficient, high-performing subnetworks within pre-trained, dense network architectures. More often than not, it involves a three-step process -- pre-training, pruning, and re-training -- that is computationally … (the three-step pipeline is sketched after this record).
External link:
http://arxiv.org/abs/2108.00259
Authors:
Wolfe, Cameron R., Yang, Jingkang, Liao, Fangshuo, Chowdhury, Arindam, Dun, Chen, Bayer, Artun, Segarra, Santiago, Kyrillidis, Anastasios
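The standard three-step pipeline the snippet refers to, written out for a single weight matrix. Here train is a placeholder training routine assumed to accept an optional mask, and the prune fraction is arbitrary, so this is only a generic sketch, not the paper's method.

import numpy as np

def three_step_pruning(W_init, train, prune_frac=0.8):
    W = train(W_init)                                   # 1. pre-train the dense model
    thresh = np.quantile(np.abs(W), prune_frac)
    mask = (np.abs(W) >= thresh).astype(W.dtype)        # 2. prune the smallest-magnitude weights
    W = train(W * mask, mask=mask)                      # 3. re-train the surviving subnetwork
    return W * mask, mask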
Published in:
Journal of Applied & Computational Topology; Oct 2024, Vol. 8, Issue 5, p. 1363-1415, 53 p.
Authors:
Liao, Fangshuo, Kyrillidis, Anastasios
Current state-of-the-art analyses on the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity. While g…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1592ba64a74a5b8313210937b15287e4