Showing 1 - 10 of 3,281 for search: '"Velickovic A"'
This study explores the intersection of neural networks and classical robotics algorithms through the Neural Algorithmic Reasoning (NAR) framework, which enables neural networks to be trained to reason effectively like classical robotics algorithms by learning…
External link:
http://arxiv.org/abs/2410.11031
Author:
Barbero, Federico, Vitvitskyi, Alex, Perivolaropoulos, Christos, Pascanu, Razvan, Veličković, Petar
Positional Encodings (PEs) are a critical component of Transformer-based Large Language Models (LLMs), providing the attention mechanism with important sequence-position information. One of the most popular types of encoding used today in LLMs are Ro…
External link:
http://arxiv.org/abs/2410.06205
In spite of the plethora of success stories with graph neural networks (GNNs) on modelling graph-structured data, they are notoriously vulnerable to over-squashing, whereby tasks necessitate the mixing of information between distant pairs of nodes.
External link:
http://arxiv.org/abs/2410.03424
Author:
de Luca, Artur Back, Giapitzakis, George, Yang, Shenghao, Veličković, Petar, Fountoulakis, Kimon
There has been a growing interest in the ability of neural networks to solve algorithmic tasks, such as arithmetic, summary statistics, and sorting. While state-of-the-art models like Transformers have demonstrated good generalization performance on…
External link:
http://arxiv.org/abs/2410.01686
A key property of reasoning systems is the ability to make sharp decisions on their input data. For contemporary AI systems, a key carrier of sharp behaviour is the softmax function, with its capability to perform differentiable query-key lookups. It…
External link:
http://arxiv.org/abs/2410.01104
Author:
Xu, Kaijia, Veličković, Petar
Neural algorithmic reasoning (NAR) is an emerging field that seeks to design neural networks that mimic classical algorithmic computations. Today, graph neural networks (GNNs) are widely used in neural algorithmic reasoners due to their message passi…
External link:
http://arxiv.org/abs/2409.07154
We explore graph rewiring methods that optimise commute time. Recent graph rewiring approaches facilitate long-range interactions in sparse graphs, making such rewirings commute-time-optimal on average. However, when an expert prior exists on which n…
External link:
http://arxiv.org/abs/2407.08762
Author:
Bounsi, Wilfried, Ibarz, Borja, Dudzik, Andrew, Hamrick, Jessica B., Markeeva, Larisa, Vitvitskyi, Alex, Pascanu, Razvan, Veličković, Petar
Transformers have revolutionized machine learning with their simple yet effective architecture. Pre-training Transformers on massive text datasets from the Internet has led to unmatched generalization for natural language understanding (NLU) tasks. H…
External link:
http://arxiv.org/abs/2406.09308
Author:
Barbero, Federico, Banino, Andrea, Kapturowski, Steven, Kumaran, Dharshan, Araújo, João G. M., Vitvitskyi, Alex, Pascanu, Razvan, Veličković, Petar
We study how information propagates in decoder-only Transformers, which are the architectural backbone of most existing frontier large language models (LLMs). We rely on a theoretical signal propagation analysis -- specifically, we analyse the repres…
External link:
http://arxiv.org/abs/2406.04267
Author:
Markeeva, Larisa, McLeish, Sean, Ibarz, Borja, Bounsi, Wilfried, Kozlova, Olga, Vitvitskyi, Alex, Blundell, Charles, Goldstein, Tom, Schwarzschild, Avi, Veličković, Petar
Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path towards building intelligent systems. Most recent studies dedicated to reasoning focus on out-of-distribution performance on procedurally-generated synthe…
External link:
http://arxiv.org/abs/2406.04229