Zobrazeno 1 - 10
of 207
pro vyhledávání: '"Mendes Caio"'
Autor:
Abdin, Marah, Aneja, Jyoti, Behl, Harkirat, Bubeck, Sébastien, Eldan, Ronen, Gunasekar, Suriya, Harrison, Michael, Hewett, Russell J., Javaheripi, Mojan, Kauffmann, Piero, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Liu, Weishung, Mendes, Caio C. T., Nguyen, Anh, Price, Eric, de Rosa, Gustavo, Saarikivi, Olli, Salim, Adil, Shah, Shital, Wang, Xin, Ward, Rachel, Wu, Yue, Yu, Dingli, Zhang, Cyril, Zhang, Yi
We present phi-4, a 14-billion parameter language model developed with a training recipe that is centrally focused on data quality. Unlike most language models, where pre-training is based primarily on organic data sources such as web content or code
Externí odkaz:
http://arxiv.org/abs/2412.08905
Autor:
Abdin, Marah, Aneja, Jyoti, Awadalla, Hany, Awadallah, Ahmed, Awan, Ammar Ahmad, Bach, Nguyen, Bahree, Amit, Bakhtiari, Arash, Bao, Jianmin, Behl, Harkirat, Benhaim, Alon, Bilenko, Misha, Bjorck, Johan, Bubeck, Sébastien, Cai, Martin, Cai, Qin, Chaudhary, Vishrav, Chen, Dong, Chen, Dongdong, Chen, Weizhu, Chen, Yen-Chun, Chen, Yi-Ling, Cheng, Hao, Chopra, Parul, Dai, Xiyang, Dixon, Matthew, Eldan, Ronen, Fragoso, Victor, Gao, Jianfeng, Gao, Mei, Gao, Min, Garg, Amit, Del Giorno, Allie, Goswami, Abhishek, Gunasekar, Suriya, Haider, Emman, Hao, Junheng, Hewett, Russell J., Hu, Wenxiang, Huynh, Jamie, Iter, Dan, Jacobs, Sam Ade, Javaheripi, Mojan, Jin, Xin, Karampatziakis, Nikos, Kauffmann, Piero, Khademi, Mahoud, Kim, Dongwoo, Kim, Young Jin, Kurilenko, Lev, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Li, Yunsheng, Liang, Chen, Liden, Lars, Lin, Xihui, Lin, Zeqi, Liu, Ce, Liu, Liyuan, Liu, Mengchen, Liu, Weishung, Liu, Xiaodong, Luo, Chong, Madan, Piyush, Mahmoudzadeh, Ali, Majercak, David, Mazzola, Matt, Mendes, Caio César Teodoro, Mitra, Arindam, Modi, Hardik, Nguyen, Anh, Norick, Brandon, Patra, Barun, Perez-Becker, Daniel, Portet, Thomas, Pryzant, Reid, Qin, Heyang, Radmilac, Marko, Ren, Liliang, de Rosa, Gustavo, Rosset, Corby, Roy, Sambudha, Ruwase, Olatunji, Saarikivi, Olli, Saied, Amin, Salim, Adil, Santacroce, Michael, Shah, Shital, Shang, Ning, Sharma, Hiteshi, Shen, Yelong, Shukla, Swadheen, Song, Xia, Tanaka, Masahiro, Tupini, Andrea, Vaddamanu, Praneetha, Wang, Chunyu, Wang, Guanhua, Wang, Lijuan, Wang, Shuohang, Wang, Xin, Wang, Yu, Ward, Rachel, Wen, Wen, Witte, Philipp, Wu, Haiping, Wu, Xiaoxia, Wyatt, Michael, Xiao, Bin, Xu, Can, Xu, Jiahang, Xu, Weijian, Xue, Jilong, Yadav, Sonali, Yang, Fan, Yang, Jianwei, Yang, Yifan, Yang, Ziyi, Yu, Donghan, Yuan, Lu, Zhang, Chenruidong, Zhang, Cyril, Zhang, Jianwen, Zhang, Li Lyna, Zhang, Yi, Zhang, Yue, Zhang, Yunan, Zhou, Xiren
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi
Externí odkaz:
http://arxiv.org/abs/2404.14219
Autor:
Gunasekar, Suriya, Zhang, Yi, Aneja, Jyoti, Mendes, Caio César Teodoro, Del Giorno, Allie, Gopi, Sivakanth, Javaheripi, Mojan, Kauffmann, Piero, de Rosa, Gustavo, Saarikivi, Olli, Salim, Adil, Shah, Shital, Behl, Harkirat Singh, Wang, Xin, Bubeck, Sébastien, Eldan, Ronen, Kalai, Adam Tauman, Lee, Yin Tat, Li, Yuanzhi
We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of ``textbook quality" data from
Externí odkaz:
http://arxiv.org/abs/2306.11644
This work mainly concerns the -- here introduced -- category of $\mathscr Q$-sets and functional morphisms, where $\mathscr Q$ is a commutative semicartesian quantale. We describe, in detail, the limits and colimits of this complete and cocomplete ca
Externí odkaz:
http://arxiv.org/abs/2302.03123
This work is largely focused on extending D. Higgs' $\Omega$-sets to the context of quantales, following the broad program of U. H\"ohle, we explore the rich category of $\mathscr Q$-sets for strong, integral and commutative quantales, or other simil
Externí odkaz:
http://arxiv.org/abs/2302.03691
Autor:
Jawahar, Ganesh, Mukherjee, Subhabrata, Dey, Debadeepta, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Mendes, Caio Cesar Teodoro, de Rosa, Gustavo Henrique, Shah, Shital
Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with hi
Externí odkaz:
http://arxiv.org/abs/2210.03251
In this paper, we introduce a new definition of sheaves on semicartesian quantales, providing first examples and categorical properties. We note that our sheaves are similar to the standard definition of a sheaf on a locale, however, we prove in that
Externí odkaz:
http://arxiv.org/abs/2204.08351
Autor:
Girish, Sharath, Dey, Debadeepta, Joshi, Neel, Vineet, Vibhav, Shah, Shital, Mendes, Caio Cesar Teodoro, Shrivastava, Abhinav, Song, Yale
The current literature on self-supervised learning (SSL) focuses on developing learning objectives to train neural networks more effectively on unlabeled data. The typical development process involves taking well-established architectures, e.g., ResN
Externí odkaz:
http://arxiv.org/abs/2203.08130
Autor:
Javaheripi, Mojan, de Rosa, Gustavo H., Mukherjee, Subhabrata, Shah, Shital, Religa, Tomasz L., Mendes, Caio C. T., Bubeck, Sebastien, Koushanfar, Farinaz, Dey, Debadeepta
The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak
Externí odkaz:
http://arxiv.org/abs/2203.02094
Akademický článek
Tento výsledek nelze pro nepřihlášené uživatele zobrazit.
K zobrazení výsledku je třeba se přihlásit.
K zobrazení výsledku je třeba se přihlásit.