Showing 1 - 10 of 21 for the search: '"Thekumparampil, Kiran Koshy"'
Author:
Tang, Xun, Rahmanian, Holakou, Shavlovsky, Michael, Thekumparampil, Kiran Koshy, Xiao, Tesi, Ying, Lexing
Entropic optimal transport (OT) and the Sinkhorn algorithm have made it practical for machine learning practitioners to perform the fundamental task of calculating transport distance between statistical distributions. In this work, we focus on a gene…
External link:
http://arxiv.org/abs/2403.05054
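The snippet above refers to the Sinkhorn algorithm for entropic OT, which works purely by alternating matrix scaling. A minimal NumPy sketch of that scaling loop (the regularization strength, iteration count, and the toy cost matrix are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def sinkhorn(C, r, c, eps=0.1, iters=200):
    """Entropic OT between marginals r and c under cost matrix C.

    Alternately rescales the rows and columns of the Gibbs kernel
    K = exp(-C / eps) until both marginal constraints are (nearly) met.
    Returns the transport plan P.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(r)
    for _ in range(iters):
        v = c / (K.T @ u)         # scale columns to match c
        u = r / (K @ v)           # scale rows to match r
    return u[:, None] * K * v[None, :]
```

The entropic transport cost is then `(P * C).sum()` for the returned plan `P`.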
Author:
Tang, Xun, Shavlovsky, Michael, Rahmanian, Holakou, Tardini, Elisa, Thekumparampil, Kiran Koshy, Xiao, Tesi, Ying, Lexing
Computing the optimal transport distance between statistical distributions is a fundamental task in machine learning. One remarkable recent advancement is entropic regularization and the Sinkhorn algorithm, which utilizes only matrix scaling and guar…
External link:
http://arxiv.org/abs/2401.12253
The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy. First, as the size of LLMs continues to grow, the memory demands of gradient-based training methods via back…
External link:
http://arxiv.org/abs/2310.09639
Author:
Hou, Charlie, Thekumparampil, Kiran Koshy, Shavlovsky, Michael, Fanti, Giulia, Dattatreya, Yesh, Sanghavi, Sujay
On tabular data, a significant body of literature has shown that current deep learning (DL) models perform at best similarly to Gradient Boosted Decision Trees (GBDTs), while significantly underperforming them on outlier data. However, these works of…
External link:
http://arxiv.org/abs/2308.00177
We study differentially private (DP) algorithms for smooth stochastic minimax optimization, with stochastic minimization as a byproduct. The holy grail of these settings is to guarantee the optimal trade-off between the privacy and the excess populat…
External link:
http://arxiv.org/abs/2206.00363
We study the bilinearly coupled minimax problem: $\min_{x} \max_{y} f(x) + y^\top A x - h(y)$, where $f$ and $h$ are both strongly convex smooth functions and admit first-order gradient oracles. Surprisingly, no known first-order algorithms have hith…
External link:
http://arxiv.org/abs/2201.07427
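For the bilinearly coupled problem $\min_x \max_y f(x) + y^\top A x - h(y)$ in the snippet above, the simplest first-order baseline is simultaneous gradient descent-ascent. A minimal sketch (the quadratic choices of $f$ and $h$ in the usage, the step size, and the iteration count are assumptions for illustration; the paper concerns methods with better rates than this baseline):

```python
import numpy as np

def gda(A, grad_f, grad_h, x0, y0, eta=0.05, iters=2000):
    """Simultaneous gradient descent-ascent on
    min_x max_y f(x) + y^T A x - h(y), using first-order oracles."""
    x, y = x0.astype(float), y0.astype(float)
    for _ in range(iters):
        gx = grad_f(x) + A.T @ y   # gradient of the objective in x
        gy = A @ x - grad_h(y)     # gradient of the objective in y
        x = x - eta * gx           # descent step on x
        y = y + eta * gy           # ascent step on y
    return x, y
```

With $f(x) = \tfrac12\|x\|^2$ and $h(y) = \tfrac12\|y\|^2$ (so `grad_f` and `grad_h` are identity maps), the optimality conditions $x + A^\top y = 0$ and $Ax - y = 0$ force the saddle point to the origin, which the iterates approach for a small enough step size.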
Meta-learning synthesizes and leverages the knowledge from a given set of tasks to rapidly learn new tasks using very little data. Meta-learning of linear regression tasks, where the regressors lie in a low-dimensional subspace, is an extensively-stu…
External link:
http://arxiv.org/abs/2105.08306
We consider the classical setting of optimizing a nonsmooth Lipschitz continuous convex function over a convex constraint set, when having access to a (stochastic) first-order oracle (FO) for the function and a projection oracle (PO) for the constrai…
External link:
http://arxiv.org/abs/2010.01848
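The FO/PO setting in the snippet above is classically handled by the projected subgradient method; a minimal sketch of that baseline (the $1/\sqrt{k}$ step schedule, iterate averaging, and the toy problem in the usage are assumptions, not the paper's projection-efficient scheme, which aims to call the PO far less often):

```python
import numpy as np

def projected_subgradient(subgrad, project, x0, steps=500):
    """Projected subgradient method x_{k+1} = PO(x_k - eta_k * FO(x_k))
    with eta_k ~ 1/sqrt(k+1); returns the averaged iterate."""
    x = x0.astype(float)
    avg = np.zeros_like(x)
    for k in range(steps):
        eta = 1.0 / np.sqrt(k + 1)
        x = project(x - eta * subgrad(x))  # one FO call, one PO call per step
        avg += x
    return avg / steps
```

For example, minimizing $|x - 2|$ over $[-1, 1]$ (subgradient `np.sign(x - 2)`, projection `np.clip(z, -1, 1)`) drives the averaged iterate to the constrained minimizer $x = 1$.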
This paper studies first order methods for solving smooth minimax optimization problems $\min_x \max_y g(x,y)$ where $g(\cdot,\cdot)$ is smooth and $g(x,\cdot)$ is concave for each $x$. In terms of $g(\cdot,y)$, we consider two settings -- strongly c…
External link:
http://arxiv.org/abs/1907.01543
Disentangled generative models map a latent code vector to a target space, while enforcing that a subset of the learned latent codes are interpretable and associated with distinct properties of the target distribution. Recent advances have been domin…
External link:
http://arxiv.org/abs/1906.06034