Showing 1 - 10 of 260 for search: '"Dhillon, Inderjit S"'
Data pruning, the combinatorial task of selecting a small and informative subset from a large dataset, is crucial for mitigating the enormous computational costs associated with training data-hungry modern deep learning models at scale. Since large-s…
External link:
http://arxiv.org/abs/2406.17188
Author:
Das, Rudrajit, Dhillon, Inderjit S., Epasto, Alessandro, Javanmard, Adel, Mao, Jieming, Mirrokni, Vahab, Sanghavi, Sujay, Zhong, Peilin
The performance of a model trained with noisy labels is often improved by simply retraining the model with its own predicted hard labels (i.e., 1/0 labels). Yet, a detailed theoretical characterization of this phenomeno…
External link:
http://arxiv.org/abs/2406.11206
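The abstract above describes the retraining trick only in outline. A minimal sketch of the idea, assuming a toy logistic-regression setup with synthetic data (not the paper's setting or analysis): fit on noisy labels, replace the labels with the model's own hard 0/1 predictions, and fit again.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linearly separable data with 20% symmetric label noise.
X = rng.normal(size=(500, 2))
y_clean = (X[:, 0] + X[:, 1] > 0).astype(float)
flip = rng.random(500) < 0.2
y_noisy = np.where(flip, 1 - y_clean, y_clean)

def train_logreg(X, y, lr=0.5, steps=300):
    """Plain gradient descent on the mean logistic loss; returns a weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return np.mean(((X @ w) > 0).astype(float) == y)

# Round 1: fit on the noisy labels.
w1 = train_logreg(X, y_noisy)

# Retraining step: relabel with the model's own hard (0/1) predictions, then refit.
y_hard = ((X @ w1) > 0).astype(float)
w2 = train_logreg(X, y_hard)
```

In this linear toy case the hard labels reproduce the first model's decision boundary, so the second fit mostly confirms it; the paper's question is when and why such retraining genuinely helps.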
There is a notable dearth of results characterizing the preconditioning effect of Adam and showing how it may alleviate the curse of ill-conditioning -- an issue plaguing gradient descent (GD). In this work, we perform a detailed analysis of Adam's p…
External link:
http://arxiv.org/abs/2402.07114
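As a rough illustration of the preconditioning idea mentioned above (a toy comparison, not the paper's analysis), Adam's per-coordinate scaling makes its bias-corrected first step roughly ±lr in every coordinate regardless of gradient magnitude, whereas GD's stable step size is capped by the largest curvature:

```python
import numpy as np

# Ill-conditioned diagonal quadratic: f(x) = 0.5 * (100*x0^2 + 1*x1^2).
H = np.array([100.0, 1.0])
grad = lambda x: H * x

def run_gd(x, lr, steps):
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

def run_adam(x, lr, steps, b1=0.9, b2=0.999, eps=1e-8):
    m, v = np.zeros_like(x), np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        mhat = m / (1 - b1 ** t)          # bias-corrected first moment
        vhat = v / (1 - b2 ** t)          # bias-corrected second moment
        x = x - lr * mhat / (np.sqrt(vhat) + eps)
    return x

x0 = np.array([1.0, 1.0])
# GD must keep lr below 2/100 for stability, so the flat x1 direction crawls.
x_gd = run_gd(x0, lr=0.01, steps=200)
x_adam = run_adam(x0, lr=0.05, steps=200)
```

After 200 steps GD has only shrunk the flat coordinate to 0.99^200 ≈ 0.13, while Adam's normalized updates drive both coordinates toward zero at comparable rates.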
Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts. However, these prompts can be lengthy, often compris…
External link:
http://arxiv.org/abs/2311.10117
Author:
Wang, Yihan, Si, Si, Li, Daliang, Lukasik, Michal, Yu, Felix, Hsieh, Cho-Jui, Dhillon, Inderjit S, Kumar, Sanjiv
Pretrained large language models (LLMs) are general purpose problem solvers applicable to a diverse set of tasks with prompts. They can be further improved towards a specific task by fine-tuning on a specialized dataset. However, fine-tuning usually…
External link:
http://arxiv.org/abs/2211.00635
Extreme multi-label classification (XMC) is a popular framework for solving many real-world problems that require accurate prediction from a very large number of potential output choices. A popular approach for dealing with the large label space is t…
External link:
http://arxiv.org/abs/2210.08410
Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example, as a fast search procedure with two tower deep learning models. Graph-based methods for AKNNS in particular have received great attention due…
External link:
http://arxiv.org/abs/2206.11408
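The graph-based AKNNS approach mentioned above can be sketched minimally: build a neighbor graph over the points, then answer queries by greedily walking to ever-closer neighbors. This toy version uses a brute-force k-NN graph and a beam width of 1; production indexes (e.g. HNSW, NSG) build the graph hierarchically and search with larger beams.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(size=(200, 8))

def build_knn_graph(pts, k=8):
    """Brute-force k-NN graph: each node links to its k nearest points."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def greedy_search(pts, graph, query, entry=0):
    """Beam-width-1 walk: hop to the closest neighbor until no hop improves."""
    cur = entry
    cur_d = np.linalg.norm(pts[cur] - query)
    while True:
        nbrs = graph[cur]
        ds = np.linalg.norm(pts[nbrs] - query, axis=1)
        j = int(np.argmin(ds))
        if ds[j] >= cur_d:          # local optimum: no neighbor is closer
            return cur
        cur, cur_d = int(nbrs[j]), ds[j]

graph = build_knn_graph(points)
q = rng.normal(size=8)
found = greedy_search(points, graph, q)
exact = int(np.argmin(np.linalg.norm(points - q, axis=1)))
```

The greedy walk is only approximate: it can stop at a local optimum, which is why practical systems add hierarchy and wider beams to improve recall.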
Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current i…
External link:
http://arxiv.org/abs/2204.10936
Data augmentation is popular in the training of large neural networks; currently, however, there is no clear theoretical comparison between different algorithmic choices on how to use augmented data. In this paper, we take a step in this direction -…
External link:
http://arxiv.org/abs/2202.12230
Author:
Chien, Eli, Chang, Wei-Cheng, Hsieh, Cho-Jui, Yu, Hsiang-Fu, Zhang, Jiong, Milenkovic, Olgica, Dhillon, Inderjit S
Learning on graphs has attracted significant attention in the learning community due to numerous real-world applications. In particular, graph neural networks (GNNs), which take numerical node features and graph structure as inputs, have been shown t…
External link:
http://arxiv.org/abs/2111.00064