Showing 1 - 10 of 820 for search: '"Hsu, Daniel"'
Weakly supervised learning aims to reduce the cost of labeling data by using expert-designed labeling rules. However, existing methods require experts to design effective rules in a single shot, which is difficult in the absence of proper guidance an
External link:
http://arxiv.org/abs/2409.05199
A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient for a two-layer transformer.
External link:
http://arxiv.org/abs/2408.14332
The transformer architecture has prevailed in various deep learning settings due to its exceptional capabilities to select and compose structural information. Motivated by these capabilities, Sanford et al. proposed the sparse token selection task, i
External link:
http://arxiv.org/abs/2406.06893
We study the problem of online multi-group learning, a learning model in which an online learner must simultaneously achieve small prediction regret on a large collection of (possibly overlapping) subsequences corresponding to a family of groups. Gro
External link:
http://arxiv.org/abs/2406.05287
Restaurants are critical venues at which to investigate foodborne illness outbreaks due to shared sourcing, preparation, and distribution of foods. Formal channels to report illness after food consumption, such as 311, New York City's non-emergency m
External link:
http://arxiv.org/abs/2405.06138
We show that a constant number of self-attention layers can efficiently simulate, and be simulated by, a constant number of communication rounds of Massively Parallel Computation. As a consequence, we show that logarithmic depth is sufficient for tra
External link:
http://arxiv.org/abs/2402.09268
Authors:
Deng, Samuel; Hsu, Daniel
The multi-group learning model formalizes the learning scenario in which a single predictor must generalize well on multiple, possibly overlapping subgroups of interest. We extend the study of multi-group learning to the natural case where the groups
External link:
http://arxiv.org/abs/2402.00258
We study the problem of auditing classifiers with the notion of statistical subgroup fairness. Kearns et al. (2018) have shown that the problem of auditing combinatorial subgroup fairness is as hard as agnostic learning. Essentially all work on remed
External link:
http://arxiv.org/abs/2401.16439
We consider the problem of sufficient dimension reduction (SDR) for multi-index models. The estimators of the central mean subspace in prior works either have slow (non-parametric) convergence rates, or rely on stringent distributional conditions (e.
External link:
http://arxiv.org/abs/2312.15469
Authors:
Hsu, Daniel; Mazumdar, Arya
The logistic regression model is one of the most popular data generation models in noisy binary classification problems. In this work, we study the sample complexity of estimating the parameters of the logistic regression model up to a given $\ell_2$
External link:
http://arxiv.org/abs/2307.04191