Zobrazeno 1 - 3
of 3
pro vyhledávání: '"Gissin, Daniel"'
Autor:
Jamba Team, Lenz, Barak, Arazi, Alan, Bergman, Amir, Manevich, Avshalom, Peleg, Barak, Aviram, Ben, Almagor, Chen, Fridman, Clara, Padnos, Dan, Gissin, Daniel, Jannai, Daniel, Muhlgay, Dor, Zimberg, Dor, Gerber, Edden M, Dolev, Elad, Krakovsky, Eran, Safahi, Erez, Schwartz, Erez, Cohen, Gal, Shachaf, Gal, Rozenblum, Haim, Bata, Hofit, Blass, Ido, Magar, Inbal, Dalmedigos, Itay, Osin, Jhonathan, Fadlon, Julie, Rozman, Maria, Danos, Matan, Gokhman, Michael, Zusman, Mor, Gidron, Naama, Ratner, Nir, Gat, Noam, Rozen, Noam, Fried, Oded, Leshno, Ohad, Antverg, Omer, Abend, Omri, Lieber, Opher, Dagan, Or, Cohavi, Orit, Alon, Raz, Belson, Ro'i, Cohen, Roi, Gilad, Rom, Glozman, Roman, Lev, Shahar, Meirom, Shaked, Delbari, Tal, Ness, Tal, Asida, Tomer, Gal, Tom Ben, Braude, Tom, Pumerantz, Uriya, Cohen, Yehoshua, Belinkov, Yonatan, Globerson, Yuval, Levy, Yuval Peleg, Shoham, Yoav
We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retai
Externí odkaz:
http://arxiv.org/abs/2408.12570
A leading hypothesis for the surprising generalization of neural networks is that the dynamics of gradient descent bias the model towards simple solutions, by searching through the solution space in an incremental order of complexity. We formally def
Externí odkaz:
http://arxiv.org/abs/1909.12051
Autor:
Gissin, Daniel, Shalev-Shwartz, Shai
We propose a new batch mode active learning algorithm designed for neural networks and large query batch sizes. The method, Discriminative Active Learning (DAL), poses active learning as a binary classification task, attempting to choose examples to
Externí odkaz:
http://arxiv.org/abs/1907.06347