Search results

Report

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Author: Abdin, Marah, Aneja, Jyoti, Awadalla, Hany, Awadallah, Ahmed, Awan, Ammar Ahmad, Bach, Nguyen, Bahree, Amit, Bakhtiari, Arash, Bao, Jianmin, Behl, Harkirat, Benhaim, Alon, Bilenko, Misha, Bjorck, Johan, Bubeck, Sébastien, Cai, Martin, Cai, Qin, Chaudhary, Vishrav, Chen, Dong, Chen, Dongdong, Chen, Weizhu, Chen, Yen-Chun, Chen, Yi-Ling, Cheng, Hao, Chopra, Parul, Dai, Xiyang, Dixon, Matthew, Eldan, Ronen, Fragoso, Victor, Gao, Jianfeng, Gao, Mei, Gao, Min, Garg, Amit, Del Giorno, Allie, Goswami, Abhishek, Gunasekar, Suriya, Haider, Emman, Hao, Junheng, Hewett, Russell J., Hu, Wenxiang, Huynh, Jamie, Iter, Dan, Jacobs, Sam Ade, Javaheripi, Mojan, Jin, Xin, Karampatziakis, Nikos, Kauffmann, Piero, Khademi, Mahoud, Kim, Dongwoo, Kim, Young Jin, Kurilenko, Lev, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Li, Yunsheng, Liang, Chen, Liden, Lars, Lin, Xihui, Lin, Zeqi, Liu, Ce, Liu, Liyuan, Liu, Mengchen, Liu, Weishung, Liu, Xiaodong, Luo, Chong, Madan, Piyush, Mahmoudzadeh, Ali, Majercak, David, Mazzola, Matt, Mendes, Caio César Teodoro, Mitra, Arindam, Modi, Hardik, Nguyen, Anh, Norick, Brandon, Patra, Barun, Perez-Becker, Daniel, Portet, Thomas, Pryzant, Reid, Qin, Heyang, Radmilac, Marko, Ren, Liliang, de Rosa, Gustavo, Rosset, Corby, Roy, Sambudha, Ruwase, Olatunji, Saarikivi, Olli, Saied, Amin, Salim, Adil, Santacroce, Michael, Shah, Shital, Shang, Ning, Sharma, Hiteshi, Shen, Yelong, Shukla, Swadheen, Song, Xia, Tanaka, Masahiro, Tupini, Andrea, Vaddamanu, Praneetha, Wang, Chunyu, Wang, Guanhua, Wang, Lijuan, Wang, Shuohang, Wang, Xin, Wang, Yu, Ward, Rachel, Wen, Wen, Witte, Philipp, Wu, Haiping, Wu, Xiaoxia, Wyatt, Michael, Xiao, Bin, Xu, Can, Xu, Jiahang, Xu, Weijian, Xue, Jilong, Yadav, Sonali, Yang, Fan, Yang, Jianwei, Yang, Yifan, Yang, Ziyi, Yu, Donghan, Yuan, Lu, Zhang, Chenruidong, Zhang, Cyril, Zhang, Jianwen, Zhang, Li Lyna, Zhang, Yi, Zhang, Yue, Zhang, Yunan, Zhou, Xiren

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi

External link: http://arxiv.org/abs/2404.14219

Report

Textbooks Are All You Need

Author: Gunasekar, Suriya, Zhang, Yi, Aneja, Jyoti, Mendes, Caio César Teodoro, Del Giorno, Allie, Gopi, Sivakanth, Javaheripi, Mojan, Kauffmann, Piero, de Rosa, Gustavo, Saarikivi, Olli, Salim, Adil, Shah, Shital, Behl, Harkirat Singh, Wang, Xin, Bubeck, Sébastien, Eldan, Ronen, Kalai, Adam Tauman, Lee, Yin Tat, Li, Yuanzhi

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from

External link: http://arxiv.org/abs/2306.11644

Report

$\mathscr Q$-Sets and Friends: Categorical Constructions and Categorical Properties

Author: Alvim, José Goudet, Mendes, Caio de Andrade, Mariano, Hugo Luiz

This work mainly concerns the -- here introduced -- category of $\mathscr Q$-sets and functional morphisms, where $\mathscr Q$ is a commutative semicartesian quantale. We describe, in detail, the limits and colimits of this complete and cocomplete ca

External link: http://arxiv.org/abs/2302.03123

Report

$\mathscr Q$-Sets and Friends: Regarding Singleton and Gluing Completeness

Author: Alvim, José Goudet, Mendes, Caio de Andrade, Mariano, Hugo Luiz

This work is largely focused on extending D. Higgs' $\Omega$-sets to the context of quantales, following the broad program of U. Höhle, we explore the rich category of $\mathscr Q$-sets for strong, integral and commutative quantales, or other simil

External link: http://arxiv.org/abs/2302.03691

Report

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Author: Jawahar, Ganesh, Mukherjee, Subhabrata, Dey, Debadeepta, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Mendes, Caio Cesar Teodoro, de Rosa, Gustavo Henrique, Shah, Shital

Autocomplete is a task where the user inputs a piece of text, termed prompt, which is conditioned by the model to generate semantically coherent continuation. Existing works for this task have primarily focused on datasets (e.g., email, chat) with hi

External link: http://arxiv.org/abs/2210.03251

Report

On sheaves on semicartesian quantales and their truth values

Author: Tenório, Ana Luiza, Mendes, Caio de Andrade, Mariano, Hugo Luiz

In this paper, we introduce a new definition of sheaves on semicartesian quantales, providing first examples and categorical properties. We note that our sheaves are similar to the standard definition of a sheaf on a locale, however, we prove in that

External link: http://arxiv.org/abs/2204.08351

Report

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning

Author: Girish, Sharath, Dey, Debadeepta, Joshi, Neel, Vineet, Vibhav, Shah, Shital, Mendes, Caio Cesar Teodoro, Shrivastava, Abhinav, Song, Yale

The current literature on self-supervised learning (SSL) focuses on developing learning objectives to train neural networks more effectively on unlabeled data. The typical development process involves taking well-established architectures, e.g., ResN

External link: http://arxiv.org/abs/2203.08130

Report

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

Author: Javaheripi, Mojan, de Rosa, Gustavo H., Mukherjee, Subhabrata, Shah, Shital, Religa, Tomasz L., Mendes, Caio C. T., Bubeck, Sebastien, Koushanfar, Farinaz, Dey, Debadeepta

The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints like peak

External link: http://arxiv.org/abs/2203.02094
