Search results - "SANKARARAMAN, KARTHIK"

Report

Preference Optimization with Multi-Sample Comparisons

Authors: Wang, Chaoqi; Zhao, Zhuokai; Zhu, Chen; Sankararaman, Karthik Abinav; Valko, Michal; Cao, Xuefei; Chen, Zhaorun; Khabsa, Madian; Chen, Yuxin; Ma, Hao; Wang, Sinong

Recent advancements in generative models, particularly large language models (LLMs) and diffusion models, have been driven by extensive pretraining on large datasets followed by post-training. However, current post-training methods such as reinforcem

External link: http://arxiv.org/abs/2410.12138

Report

The Perfect Blend: Redefining RLHF with Mixture of Judges

Reinforcement learning from human feedback (RLHF) has become the leading approach for fine-tuning large language models (LLM). However, RLHF has limitations in multi-task learning (MTL) due to challenges of reward hacking and extreme multi-objective

External link: http://arxiv.org/abs/2409.20370

Report

On the Equivalence of Graph Convolution and Mixup

Authors: Han, Xiaotian; Zeng, Hanqing; Chen, Yu; Nie, Shaoliang; Liu, Jingzhou; Narang, Kanika; Shakeri, Zahra; Sankararaman, Karthik Abinav; Jiang, Song; Khabsa, Madian; Wang, Qifan; Hu, Xia

This paper investigates the relationship between graph convolution and Mixup techniques. Graph convolution in a graph neural network involves aggregating features from neighboring samples to learn representative features for a specific node or sample

External link: http://arxiv.org/abs/2310.00183

Report

Effective Long-Context Scaling of Foundation Models

We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampl

External link: http://arxiv.org/abs/2309.16039

Report

Rethinking Incentives in Recommender Systems: Are Monotone Rewards Always Beneficial?

Authors: Yao, Fan; Li, Chuanhao; Sankararaman, Karthik Abinav; Liao, Yiming; Zhu, Yan; Wang, Qifan; Wang, Hongning; Xu, Haifeng

The past decade has witnessed the flourishing of a new profession as media content creators, who rely on revenue streams from online content recommendation platforms. The reward mechanism employed by these platforms creates a competitive environment

External link: http://arxiv.org/abs/2306.07893

Report

Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression

Authors: Slivkins, Aleksandrs; Zhou, Xingyu; Sankararaman, Karthik Abinav; Foster, Dylan J.

We consider contextual bandits with linear constraints (CBwLC), a variant of contextual bandits in which the algorithm consumes multiple resources subject to linear constraints on total consumption. This problem generalizes contextual bandits with kn

External link: http://arxiv.org/abs/2211.07484

Report

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

Authors: Chen, Yifang; Sankararaman, Karthik; Lazaric, Alessandro; Pirotta, Matteo; Karamshuk, Dmytro; Wang, Qifan; Mandyam, Karishma; Wang, Sinong; Fang, Han

Active learning with strong and weak labelers considers a practical setting where we have access to both costly but accurate strong labelers and inaccurate but cheap predictions provided by weak labelers. We study this problem in the streaming settin

External link: http://arxiv.org/abs/2211.02233

Report

BayesFormer: Transformer with Uncertainty Estimation

Authors: Sankararaman, Karthik Abinav; Wang, Sinong; Fang, Han

Transformer has become ubiquitous due to its dominant performance in various NLP and image processing tasks. However, it lacks understanding of how to generate mathematically grounded uncertainty estimates for transformer architectures. Models equipp

External link: http://arxiv.org/abs/2206.00826

Report

Online minimum matching with uniform metric and random arrivals

Authors: Duppala, Sharmila; Sankararaman, Karthik A.; Xu, Pan

We consider Online Minimum Bipartite Matching under the uniform metric. We show that Randomized Greedy achieves a competitive ratio equal to $(1+1/n) (H_{n+1}-1)$, which matches the lower bound. Comparing with the fact that RG achieves an optimal rat

External link: http://arxiv.org/abs/2112.05247

Report

Matching Algorithms for Blood Donation

Authors: McElfresh, Duncan C; Kroer, Christian; Pupyrev, Sergey; Sodomka, Eric; Sankararaman, Karthik; Chauvin, Zack; Dexter, Neil; Dickerson, John P

Global demand for donated blood far exceeds supply, and unmet need is greatest in low- and middle-income countries; experts suggest that large-scale coordination is necessary to alleviate demand. Using the Facebook Blood Donation tool, we conduct the

External link: http://arxiv.org/abs/2108.04862
