Výsledky vyhledávání - "Globerson, A."

Report

On the Utilization of Unique Node Identifiers in Graph Neural Networks

Autor: Bechler-Speicher, Maya, Eliasof, Moshe, Schönlieb, Carola-Bibiane, Gilad-Bachrach, Ran, Globerson, Amir

Graph Neural Networks have inherent representational limitations due to their message-passing structure. Recent work has suggested that these limitations can be overcome by using unique node identifiers (UIDs). Here we argue that despite the advantag

Externí odkaz: http://arxiv.org/abs/2411.02271

Zobrazit plný text záznamu

Report

Teaching Models to Improve on Tape

Autor: Bezalel, Liat, Orgad, Eyal, Globerson, Amir

Large Language Models (LLMs) often struggle when prompted to generate content under specific constraints. However, in such cases it is often easy to check whether these constraints are satisfied or violated. Recent works have shown that LLMs can bene

Externí odkaz: http://arxiv.org/abs/2411.01483

Zobrazit plný text záznamu

Report

Provable Benefits of Complex Parameterizations for Structured State Space Models

Autor: Ran-Milo, Yuval, Lumbroso, Eden, Cohen-Karlik, Edo, Giryes, Raja, Globerson, Amir, Cohen, Nadav

Structured state space models (SSMs), the core engine behind prominent neural networks such as S4 and Mamba, are linear dynamical systems adhering to a specified structure, most notably diagonal. In contrast to typical neural network modules, whose p

Externí odkaz: http://arxiv.org/abs/2410.14067

Zobrazit plný text záznamu

Report

Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

Autor: Jamba Team, Lenz, Barak, Arazi, Alan, Bergman, Amir, Manevich, Avshalom, Peleg, Barak, Aviram, Ben, Almagor, Chen, Fridman, Clara, Padnos, Dan, Gissin, Daniel, Jannai, Daniel, Muhlgay, Dor, Zimberg, Dor, Gerber, Edden M, Dolev, Elad, Krakovsky, Eran, Safahi, Erez, Schwartz, Erez, Cohen, Gal, Shachaf, Gal, Rozenblum, Haim, Bata, Hofit, Blass, Ido, Magar, Inbal, Dalmedigos, Itay, Osin, Jhonathan, Fadlon, Julie, Rozman, Maria, Danos, Matan, Gokhman, Michael, Zusman, Mor, Gidron, Naama, Ratner, Nir, Gat, Noam, Rozen, Noam, Fried, Oded, Leshno, Ohad, Antverg, Omer, Abend, Omri, Lieber, Opher, Dagan, Or, Cohavi, Orit, Alon, Raz, Belson, Ro'i, Cohen, Roi, Gilad, Rom, Glozman, Roman, Lev, Shahar, Meirom, Shaked, Delbari, Tal, Ness, Tal, Asida, Tomer, Gal, Tom Ben, Braude, Tom, Pumerantz, Uriya, Cohen, Yehoshua, Belinkov, Yonatan, Globerson, Yuval, Levy, Yuval Peleg, Shoham, Yoav

We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retai

Externí odkaz: http://arxiv.org/abs/2408.12570

Zobrazit plný text záznamu

Report

Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models

Autor: Bitton-Guetta, Nitzan, Slobodkin, Aviv, Maimon, Aviya, Habba, Eliya, Rassin, Royi, Bitton, Yonatan, Szpektor, Idan, Globerson, Amir, Elovici, Yuval

Imagine observing someone scratching their arm; to understand why, additional context would be necessary. However, spotting a mosquito nearby would immediately offer a likely explanation for the person's discomfort, thereby alleviating the need for f

Externí odkaz: http://arxiv.org/abs/2407.19474

Zobrazit plný text záznamu

Report

When Can Transformers Count to n?

Autor: Yehudai, Gilad, Kaplan, Haim, Ghandeharioun, Asma, Geva, Mor, Globerson, Amir

Large language models based on the transformer architectures can solve highly complex tasks. But are there simple tasks that such models cannot solve? Here we focus on very simple counting tasks, that involve counting how many times a token in the vo

Externí odkaz: http://arxiv.org/abs/2407.15160

Zobrazit plný text záznamu

Report

Do LLMs have Consistent Values?

Autor: Rozen, Naama, Bezalel, Liat, Elidan, Gal, Globerson, Amir, Daniel, Ella

Large Language Models (LLM) technology is constantly improving towards human-like dialogue. Values are a basic driving force underlying human behavior, but little research has been done to study the values exhibited in text generated by LLMs. Here we

Externí odkaz: http://arxiv.org/abs/2407.12878

Zobrazit plný text záznamu

Report

DeciMamba: Exploring the Length Extrapolation Potential of Mamba

Autor: Ben-Kish, Assaf, Zimerman, Itamar, Abu-Hussein, Shady, Cohen, Nadav, Globerson, Amir, Wolf, Lior, Giryes, Raja

Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length. A promising alternative is Mamba, which demonstrates high performance and achieves Transformer-level capabilities while r

Externí odkaz: http://arxiv.org/abs/2406.14528

Zobrazit plný text záznamu

Report

Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries

Autor: Biran, Eden, Gottesman, Daniela, Yang, Sohee, Geva, Mor, Globerson, Amir

Large language models (LLMs) can solve complex multi-step problems, but little is known about how these computations are implemented internally. Motivated by this, we study how LLMs answer multi-hop queries such as "The spouse of the performer of Ima

Externí odkaz: http://arxiv.org/abs/2406.12775

Zobrazit plný text záznamu

Report

Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

Autor: Fisch, Adam, Maynez, Joshua, Hofer, R. Alex, Dhingra, Bhuwan, Globerson, Amir, Cohen, William W.

Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate --

Externí odkaz: http://arxiv.org/abs/2406.04291

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání