Výsledky vyhledávání - "Shen, Lingfeng"

Report

It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF

Autor: Lu, Taiming, Shen, Lingfeng, Yang, Xinyu, Tan, Weiting, Chen, Beidi, Yao, Huaxiu

Reinforcement Learning from Human Feedback (RLHF) involves training policy models (PMs) and reward models (RMs) to align language models with human preferences. Instead of focusing solely on PMs and RMs independently, we propose to examine their inte

Externí odkaz: http://arxiv.org/abs/2406.07971

Zobrazit plný text záznamu

Report

DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation

Autor: Tan, Weiting, Zhang, Jingyu, Shen, Lingfeng, Khashabi, Daniel, Koehn, Philipp

Non-autoregressive Transformers (NATs) are recently applied in direct speech-to-speech translation systems, which convert speech across different languages without intermediate text data. Although NATs generate high-quality outputs and offer faster i

Externí odkaz: http://arxiv.org/abs/2405.13274

Zobrazit plný text záznamu

Report

AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies

Autor: Ye, Xiao, Wang, Andrew, Choi, Jacob, Lu, Yining, Sharma, Shreya, Shen, Lingfeng, Tiyyala, Vijay, Andrews, Nicholas, Khashabi, Daniel

Humans regularly engage in analogical thinking, relating personal experiences to current situations (X is analogous to Y because of Z). Analogical thinking allows humans to solve problems in creative ways, grasp difficult concepts, and articulate ide

Externí odkaz: http://arxiv.org/abs/2402.12370

Zobrazit plný text záznamu

Report

The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

Autor: Shen, Lingfeng, Tan, Weiting, Chen, Sihao, Chen, Yunmo, Zhang, Jingyu, Xu, Haoran, Zheng, Boyuan, Koehn, Philipp, Khashabi, Daniel

As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across d

Externí odkaz: http://arxiv.org/abs/2401.13136

Zobrazit plný text záznamu

Report

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Autor: Xu, Haoran, Sharaf, Amr, Chen, Yunmo, Tan, Weiting, Shen, Lingfeng, Van Durme, Benjamin, Murray, Kenton, Kim, Young Jin

Moderate-sized large language models (LLMs) -- those with 7B or 13B parameters -- exhibit promising machine translation (MT) performance. However, even the top-performing 13B LLM-based translation models, like ALMA, does not match the performance of

Externí odkaz: http://arxiv.org/abs/2401.08417

Zobrazit plný text záznamu

Report

Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles

Autor: Tan, Weiting, Xu, Haoran, Shen, Lingfeng, Li, Shuyue Stella, Murray, Kenton, Koehn, Philipp, Van Durme, Benjamin, Chen, Yunmo

Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations are relative

Externí odkaz: http://arxiv.org/abs/2311.02310

Zobrazit plný text záznamu

Report

Do pretrained Transformers Learn In-Context by Gradient Descent?

Autor: Shen, Lingfeng, Mishra, Aayush, Khashabi, Daniel

The emergence of In-Context Learning (ICL) in LLMs remains a remarkable phenomenon that is partially understood. To explain ICL, recent studies have created theoretical connections to Gradient Descent (GD). We ask, do such connections hold up in actu

Externí odkaz: http://arxiv.org/abs/2310.08540

Zobrazit plný text záznamu

Report

SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

Autor: Hou, Abe Bohan, Zhang, Jingyu, He, Tianxing, Wang, Yichen, Chuang, Yung-Sung, Wang, Hongwei, Shen, Lingfeng, Van Durme, Benjamin, Khashabi, Daniel, Tsvetkov, Yulia

Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. To address this issue, we propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH),

Externí odkaz: http://arxiv.org/abs/2310.03991

Zobrazit plný text záznamu

Report

The Trickle-down Impact of Reward (In-)consistency on RLHF

Autor: Shen, Lingfeng, Chen, Sihao, Song, Linfeng, Jin, Lifeng, Peng, Baolin, Mi, Haitao, Khashabi, Daniel, Yu, Dong

Standard practice within Reinforcement Learning from Human Feedback (RLHF) involves optimizing against a Reward Model (RM), which itself is trained to reflect human preferences for desirable generations. A notable subject that is understudied is the

Externí odkaz: http://arxiv.org/abs/2309.16155

Zobrazit plný text záznamu

Report

Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model

Autor: Shen, Lingfeng, Jiang, Haiyun, Liu, Lemao, Shi, Shuming

Sentence embedding is one of the most fundamental tasks in Natural Language Processing and plays an important role in various tasks. The recent breakthrough in sentence embedding is achieved by pre-trained language models (PLMs). Despite its success,

Externí odkaz: http://arxiv.org/abs/2306.02247

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání