Showing 1 - 10 of 1,088 for search: '"A. Benhaïm"'
Author:
Karaman, Batuhan K., Zabir, Ishmam, Benhaim, Alon, Chaudhary, Vishrav, Sabuncu, Mert R., Song, Xia
Balancing safety and usefulness in large language models has become a critical challenge in recent years. Models often exhibit unsafe behavior or adopt an overly cautious approach, leading to frequent overrefusal of benign prompts, which reduces their…
External link:
http://arxiv.org/abs/2410.12999
Author:
He, Yifei, Benhaim, Alon, Patra, Barun, Vaddamanu, Praneetha, Ahuja, Sanchit, Chopra, Parul, Chaudhary, Vishrav, Zhao, Han, Song, Xia
We propose a novel scaling law for general-purpose decoder-only language models (LMs) trained on multilingual data, addressing the problem of balancing languages during multilingual pretraining. A primary challenge in studying multilingual scaling is… An illustrative formula sketch follows the link below.
External link:
http://arxiv.org/abs/2410.12883
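The snippet above describes a scaling law for multilingual pretraining but is cut off before stating it. As a hedged illustration only, a Chinchilla-style per-language form is sketched below; the symbols (E_i, A_i, B_i, \alpha, \beta) and the assumption that each language's loss depends on its own token share are ours, not necessarily the paper's:

\[
L_i(N, D_i) \approx E_i + \frac{A_i}{N^{\alpha}} + \frac{B_i}{D_i^{\beta}}, \qquad D_i = p_i D,
\]

where N is the parameter count, D the total token budget, p_i the sampling proportion of language i, and L_i the validation loss on that language. Fitting such curves at small scale and extrapolating is the usual way a scaling law guides how to balance the p_i.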
One of the prominent issues stifling the current generation of large language models is their limited context length. Recent proprietary models such as GPT-4 and Claude 2 have introduced longer context lengths, 8k/32k and 100k, respectively; however,…
External link:
http://arxiv.org/abs/2410.01637
Orientation estimation is a fundamental task in 3D shape analysis which consists of estimating a shape's orientation axes: its side-, up-, and front-axes. Using this data, one can rotate a shape into canonical orientation, where its orientation axes… An illustrative code sketch of this canonicalization step follows the link below.
External link:
http://arxiv.org/abs/2410.02101
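As a minimal sketch of the canonicalization step the snippet describes (not the paper's code): given estimated side/up/front axes, build the rotation that maps them onto the x/y/z axes and apply it to the vertices. The function name, axis convention, and the orthonormality assumption are ours.

import numpy as np

def canonicalize(vertices: np.ndarray, side: np.ndarray, up: np.ndarray, front: np.ndarray) -> np.ndarray:
    # vertices: (N, 3); side/up/front: estimated unit axes, assumed orthonormal.
    # Rows of R are the estimated axes, so v -> R @ v maps side->x, up->y, front->z.
    R = np.stack([side, up, front], axis=0)
    return vertices @ R.T

# Toy usage: a shape whose axes already align with x/y/z is left unchanged.
verts = np.random.rand(100, 3)
x, y, z = np.eye(3)
assert np.allclose(canonicalize(verts, x, y, z), verts)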
State-of-the-art LLMs are powered by scaling -- scaling model size, dataset size and cluster size. It is economically infeasible to extensively tune hyperparameters for the largest runs. Instead, approximately optimal hyperparameters must be inferred… An illustrative extrapolation sketch follows the link below.
External link:
http://arxiv.org/abs/2409.19913
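As a generic illustration of inferring a hyperparameter for a run too large to sweep (not the procedure this paper proposes): fit a power law to the best learning rates found at small scales and extrapolate it. The data points below are made up.

import numpy as np

# Hypothetical (parameter count, best learning rate) pairs from small-scale sweeps.
sizes = np.array([1e8, 3e8, 1e9, 3e9])
best_lrs = np.array([6e-4, 4e-4, 3e-4, 2e-4])

# Fit log(lr) = a + b * log(size), i.e. a power law lr ~ exp(a) * size**b.
b, a = np.polyfit(np.log(sizes), np.log(best_lrs), deg=1)

# Extrapolate to a target scale that is too expensive to sweep directly.
target_size = 7e10
predicted_lr = np.exp(a + b * np.log(target_size))
print(f"predicted LR at {target_size:.0e} params: {predicted_lr:.2e}")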
We propose Segment Any Mesh (SAMesh), a novel zero-shot method for mesh part segmentation that overcomes the limitations of shape analysis-based, learning-based, and current zero-shot approaches. SAMesh operates in two phases: multimodal rendering and…
External link:
http://arxiv.org/abs/2408.13679
Author:
Ahrabian, Kian, Lin, Xihui, Patra, Barun, Chaudhary, Vishrav, Benhaim, Alon, Pujara, Jay, Song, Xia
With the growing utilization of large language models (LLMs) across domains, alignment towards human preferences has become one of the most critical aspects of training models. At the forefront of state-of-the-art human alignment methods are preference… An illustrative loss sketch follows the link below.
External link:
http://arxiv.org/abs/2407.15229
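The snippet refers to preference-optimization methods for human alignment. As one concrete, widely used instance (Direct Preference Optimization), and not necessarily the method this paper studies, a minimal loss sketch:

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    # Each argument: 1-D tensor of summed token log-probabilities per example.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the implicit reward of the chosen response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
print(dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4)).item())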
Author:
Abdin, Marah, Aneja, Jyoti, Awadalla, Hany, Awadallah, Ahmed, Awan, Ammar Ahmad, Bach, Nguyen, Bahree, Amit, Bakhtiari, Arash, Bao, Jianmin, Behl, Harkirat, Benhaim, Alon, Bilenko, Misha, Bjorck, Johan, Bubeck, Sébastien, Cai, Martin, Cai, Qin, Chaudhary, Vishrav, Chen, Dong, Chen, Dongdong, Chen, Weizhu, Chen, Yen-Chun, Chen, Yi-Ling, Cheng, Hao, Chopra, Parul, Dai, Xiyang, Dixon, Matthew, Eldan, Ronen, Fragoso, Victor, Gao, Jianfeng, Gao, Mei, Gao, Min, Garg, Amit, Del Giorno, Allie, Goswami, Abhishek, Gunasekar, Suriya, Haider, Emman, Hao, Junheng, Hewett, Russell J., Hu, Wenxiang, Huynh, Jamie, Iter, Dan, Jacobs, Sam Ade, Javaheripi, Mojan, Jin, Xin, Karampatziakis, Nikos, Kauffmann, Piero, Khademi, Mahoud, Kim, Dongwoo, Kim, Young Jin, Kurilenko, Lev, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Li, Yunsheng, Liang, Chen, Liden, Lars, Lin, Xihui, Lin, Zeqi, Liu, Ce, Liu, Liyuan, Liu, Mengchen, Liu, Weishung, Liu, Xiaodong, Luo, Chong, Madan, Piyush, Mahmoudzadeh, Ali, Majercak, David, Mazzola, Matt, Mendes, Caio César Teodoro, Mitra, Arindam, Modi, Hardik, Nguyen, Anh, Norick, Brandon, Patra, Barun, Perez-Becker, Daniel, Portet, Thomas, Pryzant, Reid, Qin, Heyang, Radmilac, Marko, Ren, Liliang, de Rosa, Gustavo, Rosset, Corby, Roy, Sambudha, Ruwase, Olatunji, Saarikivi, Olli, Saied, Amin, Salim, Adil, Santacroce, Michael, Shah, Shital, Shang, Ning, Sharma, Hiteshi, Shen, Yelong, Shukla, Swadheen, Song, Xia, Tanaka, Masahiro, Tupini, Andrea, Vaddamanu, Praneetha, Wang, Chunyu, Wang, Guanhua, Wang, Lijuan, Wang, Shuohang, Wang, Xin, Wang, Yu, Ward, Rachel, Wen, Wen, Witte, Philipp, Wu, Haiping, Wu, Xiaoxia, Wyatt, Michael, Xiao, Bin, Xu, Can, Xu, Jiahang, Xu, Weijian, Xue, Jilong, Yadav, Sonali, Yang, Fan, Yang, Jianwei, Yang, Yifan, Yang, Ziyi, Yu, Donghan, Yuan, Lu, Zhang, Chenruidong, Zhang, Cyril, Zhang, Jianwen, Zhang, Li Lyna, Zhang, Yi, Zhang, Yue, Zhang, Yunan, Zhou, Xiren
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi… An illustrative usage sketch follows the link below.
External link:
http://arxiv.org/abs/2404.14219
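A minimal usage sketch for the released model, assuming the public Hugging Face checkpoint ID microsoft/Phi-3-mini-4k-instruct (verify the exact ID and whether trust_remote_code is still needed for your transformers version):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", trust_remote_code=True)

messages = [{"role": "user", "content": "Summarize small language models in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))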
Published in:
ICASSP, Apr 2024, Seoul, South Korea
Isolating the desired speaker's voice amidst multiple speakers in a noisy acoustic context is a challenging task. Personalized speech enhancement (PSE) endeavours to achieve this by leveraging prior knowledge of the speaker's voice. Recent research efforts… An illustrative model sketch follows the link below.
External link:
http://arxiv.org/abs/2404.08022
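As a toy sketch of the general PSE idea (conditioning an enhancement network on an embedding of the target speaker's voice), not the architecture of this paper; layer sizes and names are illustrative:

import torch
import torch.nn as nn

class TinyPSE(nn.Module):
    # Predict a spectral mask for noisy features, conditioned on a speaker embedding.
    def __init__(self, n_freq: int = 257, spk_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.rnn = nn.GRU(n_freq + spk_dim, hidden, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_spec, spk_emb):
        # noisy_spec: (batch, frames, n_freq); spk_emb: (batch, spk_dim)
        cond = spk_emb.unsqueeze(1).expand(-1, noisy_spec.size(1), -1)
        h, _ = self.rnn(torch.cat([noisy_spec, cond], dim=-1))
        return noisy_spec * self.mask(h)  # masked (enhanced) magnitudes

enhanced = TinyPSE()(torch.rand(2, 100, 257), torch.rand(2, 128))
print(enhanced.shape)  # torch.Size([2, 100, 257])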
We propose a new information transfer protocol for de Sitter space, using black holes as energy reservoirs. We consider antipodal observers in pure de Sitter space in the Bunch-Davies state. They can store Hawking modes from the cosmological horizon in…
External link:
http://arxiv.org/abs/2308.13516