Výsledky vyhledávání - "Kim, Young Jin"

Report

Autor: Liu, Liyuan, Kim, Young Jin, Wang, Shuohang, Liang, Chen, Shen, Yelong, Cheng, Hao, Liu, Xiaodong, Tanaka, Masahiro, Wu, Xiaoxia, Hu, Wenxiang, Chaudhary, Vishrav, Lin, Zeqi, Zhang, Chenruidong, Xue, Jilong, Awadalla, Hany, Gao, Jianfeng, Chen, Weizhu

Mixture-of-Experts (MoE) models scale more effectively than dense models due to sparse computation through expert routing, selectively activating only a small subset of expert modules. However, sparse computation challenges traditional training pract

Externí odkaz: http://arxiv.org/abs/2409.12136

Zobrazit plný text záznamu

Report

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Autor: Abdin, Marah, Aneja, Jyoti, Awadalla, Hany, Awadallah, Ahmed, Awan, Ammar Ahmad, Bach, Nguyen, Bahree, Amit, Bakhtiari, Arash, Bao, Jianmin, Behl, Harkirat, Benhaim, Alon, Bilenko, Misha, Bjorck, Johan, Bubeck, Sébastien, Cai, Martin, Cai, Qin, Chaudhary, Vishrav, Chen, Dong, Chen, Dongdong, Chen, Weizhu, Chen, Yen-Chun, Chen, Yi-Ling, Cheng, Hao, Chopra, Parul, Dai, Xiyang, Dixon, Matthew, Eldan, Ronen, Fragoso, Victor, Gao, Jianfeng, Gao, Mei, Gao, Min, Garg, Amit, Del Giorno, Allie, Goswami, Abhishek, Gunasekar, Suriya, Haider, Emman, Hao, Junheng, Hewett, Russell J., Hu, Wenxiang, Huynh, Jamie, Iter, Dan, Jacobs, Sam Ade, Javaheripi, Mojan, Jin, Xin, Karampatziakis, Nikos, Kauffmann, Piero, Khademi, Mahoud, Kim, Dongwoo, Kim, Young Jin, Kurilenko, Lev, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Li, Yunsheng, Liang, Chen, Liden, Lars, Lin, Xihui, Lin, Zeqi, Liu, Ce, Liu, Liyuan, Liu, Mengchen, Liu, Weishung, Liu, Xiaodong, Luo, Chong, Madan, Piyush, Mahmoudzadeh, Ali, Majercak, David, Mazzola, Matt, Mendes, Caio César Teodoro, Mitra, Arindam, Modi, Hardik, Nguyen, Anh, Norick, Brandon, Patra, Barun, Perez-Becker, Daniel, Portet, Thomas, Pryzant, Reid, Qin, Heyang, Radmilac, Marko, Ren, Liliang, de Rosa, Gustavo, Rosset, Corby, Roy, Sambudha, Ruwase, Olatunji, Saarikivi, Olli, Saied, Amin, Salim, Adil, Santacroce, Michael, Shah, Shital, Shang, Ning, Sharma, Hiteshi, Shen, Yelong, Shukla, Swadheen, Song, Xia, Tanaka, Masahiro, Tupini, Andrea, Vaddamanu, Praneetha, Wang, Chunyu, Wang, Guanhua, Wang, Lijuan, Wang, Shuohang, Wang, Xin, Wang, Yu, Ward, Rachel, Wen, Wen, Witte, Philipp, Wu, Haiping, Wu, Xiaoxia, Wyatt, Michael, Xiao, Bin, Xu, Can, Xu, Jiahang, Xu, Weijian, Xue, Jilong, Yadav, Sonali, Yang, Fan, Yang, Jianwei, Yang, Yifan, Yang, Ziyi, Yu, Donghan, Yuan, Lu, Zhang, Chenruidong, Zhang, Cyril, Zhang, Jianwen, Zhang, Li Lyna, Zhang, Yi, Zhang, Yue, Zhang, Yunan, Zhou, Xiren

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi

Externí odkaz: http://arxiv.org/abs/2404.14219

Zobrazit plný text záznamu

Report

Resilient Microgrid Formation Considering Communication Interruptions

Autor: Zhong, Jian, Chen, Chen, Kim, Young-Jin, Huang, Yuxiong, Teng, Mengjie, Bian, Yiheng, Bie, Zhaohong

Distribution system (DS) communication failures following extreme events often degrade monitoring and control functions, thus preventing the acquisition of complete global DS component state information, on which existing post-disaster DS restoration

Externí odkaz: http://arxiv.org/abs/2403.01256

Zobrazit plný text záznamu

Report

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Autor: Xu, Haoran, Sharaf, Amr, Chen, Yunmo, Tan, Weiting, Shen, Lingfeng, Van Durme, Benjamin, Murray, Kenton, Kim, Young Jin

Moderate-sized large language models (LLMs) -- those with 7B or 13B parameters -- exhibit promising machine translation (MT) performance. However, even the top-performing 13B LLM-based translation models, like ALMA, does not match the performance of

Externí odkaz: http://arxiv.org/abs/2401.08417

Zobrazit plný text záznamu

Report

PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models

Autor: Kim, HyunJin, Kim, Young Jin, Bak, JinYeong

Pre-trained language models (PLMs) show impressive performance in various downstream NLP tasks. However, pre-training large language models demands substantial memory and training compute. Furthermore, due to the substantial resources required, many

Externí odkaz: http://arxiv.org/abs/2311.08590

Zobrazit plný text záznamu

Report

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness

Autor: Kim, Young Jin, Fahim, Raffy, Awadalla, Hany Hassan

Large Mixture of Experts (MoE) models could achieve state-of-the-art quality on various language tasks, including machine translation task, thanks to the efficient model scaling capability with expert parallelism. However, it has brought a fundamenta

Externí odkaz: http://arxiv.org/abs/2310.02410

Zobrazit plný text záznamu

Report

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Autor: Xu, Haoran, Kim, Young Jin, Sharaf, Amr, Awadalla, Hany Hassan

Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not been reflected in the translation task, especially those with moderate model sizes (i.e., 7B or 13B parameters), whic

Externí odkaz: http://arxiv.org/abs/2309.11674

Zobrazit plný text záznamu

Report

Task-Based MoE for Multitask Multilingual Machine Translation

Autor: Pham, Hai, Kim, Young Jin, Mukherjee, Subhabrata, Woodruff, David P., Poczos, Barnabas, Awadalla, Hany Hassan

Mixture-of-experts (MoE) architecture has been proven a powerful method for diverse tasks in training deep models in many applications. However, current MoE implementations are task agnostic, treating all tokens from different tasks in the same manne

Externí odkaz: http://arxiv.org/abs/2308.15772

Zobrazit plný text záznamu

Report

FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs

Autor: Kim, Young Jin, Henry, Rawn, Fahim, Raffy, Awadalla, Hany Hassan

Large Language Models (LLMs) have achieved state-of-the-art performance across various language tasks but pose challenges for practical deployment due to their substantial memory requirements. Furthermore, the latest generative models suffer from hig

Externí odkaz: http://arxiv.org/abs/2308.09723

Zobrazit plný text záznamu

Akademický článek

Deep Reinforcement Learning-Based Adaptive Scheduling for Wireless Time-Sensitive Networking.

Autor: Kim, Hanjin¹ (AUTHOR) gks359@koreatech.ac.kr, Kim, Young-Jin² (AUTHOR) you359@sehan.ac.kr, Kim, Won-Tae¹ (AUTHOR) wtkim@koreatech.ac.kr

Publikováno v: Sensors (14248220). Aug2024, Vol. 24 Issue 16, p5281. 21p.

Zobrazit plný text záznamu

Plný text ve formátu HTML

Vyhledávací nástroje:

Upřesnit hledání