Showing 1 - 10 of 1,545 for search: '"A, Globerson"'
Author:
Bechler-Speicher, Maya, Eliasof, Moshe, Schönlieb, Carola-Bibiane, Gilad-Bachrach, Ran, Globerson, Amir
Graph Neural Networks have inherent representational limitations due to their message-passing structure. Recent work has suggested that these limitations can be overcome by using unique node identifiers (UIDs). Here we argue that despite the advantages…
External link:
http://arxiv.org/abs/2411.02271
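For intuition only (this is not the paper's construction): unique node identifiers can be as simple as an extra feature channel that differs for every node, which breaks the symmetry that limits plain message passing. A minimal numpy sketch, with all names hypothetical:

    import numpy as np

    def add_unique_node_ids(features, rng, id_dim=4):
        # Append a random identifier channel per node (hypothetical scheme).
        # Random vectors are distinct with high probability, so they act as
        # UIDs and break the symmetry plain message passing cannot.
        uids = rng.standard_normal((features.shape[0], id_dim))
        return np.concatenate([features, uids], axis=1)

    def mean_message_passing(adj, h):
        # One round of mean aggregation: each node averages its neighbors.
        deg = adj.sum(axis=1, keepdims=True).clip(min=1)
        return adj @ h / deg

    # A 4-cycle with identical features: vanilla message passing cannot
    # distinguish any node, no matter how many rounds are applied.
    adj = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
    h = np.ones((4, 2))
    print(mean_message_passing(adj, h))        # identical rows
    h_uid = add_unique_node_ids(h, np.random.default_rng(0))
    print(mean_message_passing(adj, h_uid))    # rows now differ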
Large Language Models (LLMs) often struggle when prompted to generate content under specific constraints. However, in such cases it is often easy to check whether these constraints are satisfied or violated. Recent works have shown that LLMs can benefit…
External link:
http://arxiv.org/abs/2411.01483
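The observation that constraints are easy to check suggests a simple verify-and-resample loop: generate, test the constraint programmatically, and retry with feedback. The sketch below is a generic illustration of that idea, not the paper's method; `generate` is a hypothetical stand-in for any LLM call:

    import random

    def generate(prompt: str) -> str:
        # Hypothetical stand-in for an LLM call; swap in a real API client.
        # Dummy behavior so the sketch runs: a sentence of random length.
        return " ".join(["word"] * random.randint(5, 40))

    def check_at_most_20_words(text: str):
        # Constraints like length limits are trivial to verify exactly.
        n = len(text.split())
        return n <= 20, f"{n} words, limit is 20"

    def constrained_generate(prompt, check, max_tries=8):
        # Verify-and-resample: generating under a constraint is hard, but
        # checking the constraint on a finished output is easy.
        suffix = ""
        for _ in range(max_tries):
            text = generate(prompt + suffix)
            ok, reason = check(text)
            if ok:
                return text
            suffix = f"\nYour previous answer violated a constraint ({reason}). Try again."
        raise RuntimeError("constraint not satisfied within the retry budget")

    print(constrained_generate("Write one short sentence.", check_at_most_20_words))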
Author:
Ran-Milo, Yuval, Lumbroso, Eden, Cohen-Karlik, Edo, Giryes, Raja, Globerson, Amir, Cohen, Nadav
Structured state space models (SSMs), the core engine behind prominent neural networks such as S4 and Mamba, are linear dynamical systems adhering to a specified structure, most notably diagonal. In contrast to typical neural network modules, whose parameterizations…
External link:
http://arxiv.org/abs/2410.14067
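For readers new to SSMs: the model is just the linear recurrence x_{t+1} = A x_t + B u_t with readout y_t = Re(C x_t), where A is constrained to be diagonal and, in the complex parameterizations this paper studies, may have complex entries. A minimal illustrative sketch (not the paper's code):

    import numpy as np

    def diagonal_ssm(u, a_diag, B, C):
        # x_{t+1} = diag(a_diag) x_t + B u_t  (diagonal A => elementwise multiply)
        # y_t     = Re(C x_t)                 (real-valued readout)
        x = np.zeros_like(a_diag)
        ys = []
        for u_t in u:
            x = a_diag * x + B * u_t
            ys.append((C @ x).real)
        return np.array(ys)

    rng = np.random.default_rng(0)
    # Complex-diagonal parameterization with eigenvalues inside the unit
    # circle for stability; a real-diagonal A restricts the reachable dynamics.
    a_diag = 0.95 * np.exp(1j * rng.uniform(0, np.pi, 8))
    B, C = rng.standard_normal(8), rng.standard_normal(8)
    print(diagonal_ssm(rng.standard_normal(50), a_diag, B, C)[:5])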
Author:
Jamba Team, Lenz, Barak, Arazi, Alan, Bergman, Amir, Manevich, Avshalom, Peleg, Barak, Aviram, Ben, Almagor, Chen, Fridman, Clara, Padnos, Dan, Gissin, Daniel, Jannai, Daniel, Muhlgay, Dor, Zimberg, Dor, Gerber, Edden M, Dolev, Elad, Krakovsky, Eran, Safahi, Erez, Schwartz, Erez, Cohen, Gal, Shachaf, Gal, Rozenblum, Haim, Bata, Hofit, Blass, Ido, Magar, Inbal, Dalmedigos, Itay, Osin, Jhonathan, Fadlon, Julie, Rozman, Maria, Danos, Matan, Gokhman, Michael, Zusman, Mor, Gidron, Naama, Ratner, Nir, Gat, Noam, Rozen, Noam, Fried, Oded, Leshno, Ohad, Antverg, Omer, Abend, Omri, Lieber, Opher, Dagan, Or, Cohavi, Orit, Alon, Raz, Belson, Ro'i, Cohen, Roi, Gilad, Rom, Glozman, Roman, Lev, Shahar, Meirom, Shaked, Delbari, Tal, Ness, Tal, Asida, Tomer, Gal, Tom Ben, Braude, Tom, Pumerantz, Uriya, Cohen, Yehoshua, Belinkov, Yonatan, Globerson, Yuval, Levy, Yuval Peleg, Shoham, Yoav
We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retaining…
External link:
http://arxiv.org/abs/2408.12570
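The phrase "hybrid Transformer-Mamba mixture of experts" means the layer stack interleaves attention blocks with Mamba (SSM) blocks and replaces some dense MLPs with sparse expert layers. The schematic below is an assumed interleaving for illustration only, not Jamba's published configuration:

    # Schematic hybrid stack. The ratios (attention every 8th layer, MoE on
    # every other layer) are illustrative assumptions, not Jamba's recipe.
    def hybrid_stack(n_layers=16, attention_every=8, moe_every=2):
        layers = []
        for i in range(n_layers):
            mixer = "attention" if i % attention_every == 0 else "mamba"
            ffn = "moe" if i % moe_every == 1 else "dense_mlp"
            layers.append((mixer, ffn))
        return layers

    for i, layer in enumerate(hybrid_stack()):
        print(i, *layer)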
Author:
Bitton-Guetta, Nitzan, Slobodkin, Aviv, Maimon, Aviya, Habba, Eliya, Rassin, Royi, Bitton, Yonatan, Szpektor, Idan, Globerson, Amir, Elovici, Yuval
Imagine observing someone scratching their arm; to understand why, additional context would be necessary. However, spotting a mosquito nearby would immediately offer a likely explanation for the person's discomfort, thereby alleviating the need for further…
External link:
http://arxiv.org/abs/2407.19474
Large language models based on the transformer architecture can solve highly complex tasks. But are there simple tasks that such models cannot solve? Here we focus on very simple counting tasks that involve counting how many times a token in the vocabulary…
External link:
http://arxiv.org/abs/2407.15160
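The task in question is trivial to state as a program, which is precisely what makes it a sharp probe of transformer expressivity. A one-liner for reference:

    def count_query_token(tokens, query):
        # The probed task: how many times does `query` appear in the context?
        # Trivial as a program, yet the paper studies when transformers fail at it.
        return sum(t == query for t in tokens)

    assert count_query_token("a b a c a b".split(), "a") == 3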
Large language model (LLM) technology is constantly improving towards human-like dialogue. Values are a basic driving force underlying human behavior, but little research has been done to study the values exhibited in text generated by LLMs. Here we…
External link:
http://arxiv.org/abs/2407.12878
Author:
Ben-Kish, Assaf, Zimerman, Itamar, Abu-Hussein, Shady, Cohen, Nadav, Globerson, Amir, Wolf, Lior, Giryes, Raja
Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length. A promising alternative is Mamba, which demonstrates high performance and achieves Transformer-level capabilities while requiring…
External link:
http://arxiv.org/abs/2406.14528
Large language models (LLMs) can solve complex multi-step problems, but little is known about how these computations are implemented internally. Motivated by this, we study how LLMs answer multi-hop queries such as "The spouse of the performer of Imagine"…
External link:
http://arxiv.org/abs/2406.12775
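A two-hop query factors into two single-hop lookups, and the paper asks whether the model internally resolves the first hop before the second. A toy illustration of that factorization, with facts hard-coded for the abstract's own example:

    # Hard-coded toy facts for the abstract's example query.
    performer_of = {"Imagine": "John Lennon"}
    spouse_of = {"John Lennon": "Yoko Ono"}

    def two_hop(song):
        # "The spouse of the performer of <song>" composes two lookups;
        # the paper probes whether the LLM resolves the bridge entity
        # ("John Lennon") internally before producing the final answer.
        return spouse_of[performer_of[song]]

    assert two_hop("Imagine") == "Yoko Ono"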
Author:
Fisch, Adam, Maynez, Joshua, Hofer, R. Alex, Dhingra, Bhuwan, Globerson, Amir, Cohen, William W.
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate -- but potentially biased --…
External link:
http://arxiv.org/abs/2406.04291
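For context, the basic PPI mean estimator from the literature corrects the average of model labels with a bias estimate from the small human-labeled set. A minimal numpy sketch of that textbook form, not of the variant this paper proposes:

    import numpy as np

    def ppi_mean(yhat_unlabeled, yhat_labeled, y_labeled):
        # Prediction-powered estimate of E[Y]: the large model-labeled sample
        # provides a low-variance average, while the small human-labeled
        # sample estimates (and removes) the model's bias.
        bias = np.mean(yhat_labeled) - np.mean(y_labeled)
        return np.mean(yhat_unlabeled) - bias

    # Toy check: a model that over-predicts by 0.3 is corrected back to ~0.
    rng = np.random.default_rng(1)
    y = rng.normal(0.0, 1.0, 10_000)       # true values, mean ~ 0
    yhat = y + 0.3                          # accurate but biased predictions
    n = 100                                 # human-labeled subset size
    print(ppi_mean(yhat[n:], yhat[:n], y[:n]))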