Výsledky vyhledávání - "Zhong, Peilin"

Report

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

Autor: Li, Zeman, Zhang, Xinwei, Zhong, Peilin, Deng, Yuan, Razaviyayn, Meisam, Mirrokni, Vahab

Fine-tuning language models (LMs) with the Adam optimizer often demands excessive memory, limiting accessibility. The "in-place" version of Stochastic Gradient Descent (IP-SGD) and Memory-Efficient Zeroth-order Optimizer (MeZO) have been proposed to

Externí odkaz: http://arxiv.org/abs/2410.06441

Zobrazit plný text záznamu

Report

Massively Parallel Minimum Spanning Tree in General Metric Spaces

Autor: Azarmehr, Amir, Behnezhad, Soheil, Jayaram, Rajesh, Łącki, Jakub, Mirrokni, Vahab, Zhong, Peilin

We study the minimum spanning tree (MST) problem in the massively parallel computation (MPC) model. Our focus is particularly on the *strictly sublinear* regime of MPC where the space per machine is $O(n^\delta)$. Here $n$ is the number of vertices a

Externí odkaz: http://arxiv.org/abs/2408.06455

Zobrazit plný text záznamu

Report

Retraining with Predicted Hard Labels Provably Increases Model Accuracy

Autor: Das, Rudrajit, Dhillon, Inderjit S., Epasto, Alessandro, Javanmard, Adel, Mao, Jieming, Mirrokni, Vahab, Sanghavi, Sujay, Zhong, Peilin

The performance of a model trained with \textit{noisy labels} is often improved by simply \textit{retraining} the model with its own predicted \textit{hard} labels (i.e., $1$/$0$ labels). Yet, a detailed theoretical characterization of this phenomeno

Externí odkaz: http://arxiv.org/abs/2406.11206

Zobrazit plný text záznamu

Report

Perturb-and-Project: Differentially Private Similarities and Marginals

Autor: Cohen-Addad, Vincent, d'Orsi, Tommaso, Epasto, Alessandro, Mirrokni, Vahab, Zhong, Peilin

We revisit the input perturbations framework for differential privacy where noise is added to the input $A\in \mathcal{S}$ and the result is then projected back to the space of admissible datasets $\mathcal{S}$. Through this framework, we first desig

Externí odkaz: http://arxiv.org/abs/2406.04868

Zobrazit plný text záznamu

Report

High-Dimensional Geometric Streaming for Nearly Low Rank Data

Autor: Esfandiari, Hossein, Mirrokni, Vahab, Kacham, Praneeth, Woodruff, David P., Zhong, Peilin

We study streaming algorithms for the $\ell_p$ subspace approximation problem. Given points $a_1, \ldots, a_n$ as an insertion-only stream and a rank parameter $k$, the $\ell_p$ subspace approximation problem is to find a $k$-dimensional subspace $V$

Externí odkaz: http://arxiv.org/abs/2406.02910

Zobrazit plný text záznamu

Report

Optimal Communication for Classic Functions in the Coordinator Model and Beyond

Autor: Esfandiari, Hossein, Kacham, Praneeth, Mirrokni, Vahab, Woodruff, David P., Zhong, Peilin

In the coordinator model of communication with $s$ servers, given an arbitrary non-negative function $f$, we study the problem of approximating the sum $\sum_{i \in [n]}f(x_i)$ up to a $1 \pm \varepsilon$ factor. Here the vector $x \in R^n$ is define

Externí odkaz: http://arxiv.org/abs/2403.20307

Zobrazit plný text záznamu

Report

Parallel Approximate Maximum Flows in Near-Linear Work and Polylogarithmic Depth

Autor: Agarwal, Arpit, Khanna, Sanjeev, Li, Huan, Patil, Prathamesh, Wang, Chen, White, Nathan, Zhong, Peilin

We present a parallel algorithm for the $(1-\epsilon)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(\epsilon^{-3}\text{polylog} n)$ depth and $O(m \epsilon^{-3} \text{polylog} n)$ wo

Externí odkaz: http://arxiv.org/abs/2402.14950

Zobrazit plný text záznamu

Report

Near-Optimal $k$-Clustering in the Sliding Window Model

Autor: Woodruff, David P., Zhong, Peilin, Zhou, Samson

Clustering is an important technique for identifying structural information in large-scale data analysis, where the underlying dataset may be too large to store. In many applications, recent data can provide more accurate information and thus older d

Externí odkaz: http://arxiv.org/abs/2311.00642

Zobrazit plný text záznamu

Report

PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels

Autor: Kacham, Praneeth, Mirrokni, Vahab, Zhong, Peilin

The quadratic time and memory complexity inherent to self-attention mechanisms, with respect to sequence length, presents a critical computational bottleneck in the training and deployment of large-scale Transformer-based language models. Recent theo

Externí odkaz: http://arxiv.org/abs/2310.01655

Zobrazit plný text záznamu

Report

Massively Parallel Algorithms for High-Dimensional Euclidean Minimum Spanning Tree

Autor: Jayaram, Rajesh, Mirrokni, Vahab, Narayanan, Shyam, Zhong, Peilin

We study the classic Euclidean Minimum Spanning Tree (MST) problem in the Massively Parallel Computation (MPC) model. Given a set $X \subset \mathbb{R}^d$ of $n$ points, the goal is to produce a spanning tree for $X$ with weight within a small factor

Externí odkaz: http://arxiv.org/abs/2308.00503

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání