Showing 1–10 of 167
for the search: '"Veličković, Petar"'
In spite of the plethora of success stories with graph neural networks (GNNs) on modelling graph-structured data, they are notoriously vulnerable to over-squashing, whereby tasks necessitate the mixing of information between distant pairs of nodes.
External link:
http://arxiv.org/abs/2410.03424
Author:
de Luca, Artur Back, Giapitzakis, George, Yang, Shenghao, Veličković, Petar, Fountoulakis, Kimon
There has been a growing interest in the ability of neural networks to solve algorithmic tasks, such as arithmetic, summary statistics, and sorting. While state-of-the-art models like Transformers have demonstrated good generalization performance on…
External link:
http://arxiv.org/abs/2410.01686
A key property of reasoning systems is the ability to make sharp decisions on their input data. For contemporary AI systems, a key carrier of sharp behaviour is the softmax function, with its capability to perform differentiable query-key lookups. It…
External link:
http://arxiv.org/abs/2410.01104
Author:
Xu, Kaijia, Veličković, Petar
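The abstract above describes softmax as a differentiable query-key lookup. A minimal NumPy sketch of that idea (illustrative only; not code from the paper): the softmax weights blend the values by key similarity, and scaling the query up makes the selection sharper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_lookup(query, keys, values):
    # Differentiable query-key lookup: soft-select values by key similarity
    scores = keys @ query        # one similarity score per key
    weights = softmax(scores)    # soft, differentiable selection
    return weights @ values      # weighted average of the values

keys = np.eye(3)                        # three orthogonal keys
values = np.array([10.0, 20.0, 30.0])
query = np.array([5.0, 0.0, 0.0])       # most similar to key 0
out = attention_lookup(query, keys, values)          # near 10, but blended
sharp = attention_lookup(10 * query, keys, values)   # sharper: almost exactly 10
```

Because softmax never assigns exactly zero weight, the lookup is always a blend; the decision only becomes sharp in the limit of large scores.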
Neural algorithmic reasoning (NAR) is an emerging field that seeks to design neural networks that mimic classical algorithmic computations. Today, graph neural networks (GNNs) are widely used in neural algorithmic reasoners due to their message passing…
External link:
http://arxiv.org/abs/2409.07154
We explore graph rewiring methods that optimise commute time. Recent graph rewiring approaches facilitate long-range interactions in sparse graphs, making such rewirings commute-time-optimal on average. However, when an expert prior exists on which n…
External link:
http://arxiv.org/abs/2407.08762
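The commute time that the entry above optimises has a standard closed form via the pseudoinverse of the graph Laplacian; a small sketch of that textbook formula (not the paper's rewiring method):

```python
import numpy as np

def commute_time(adj, u, v):
    # Expected round-trip steps of a random walk between u and v,
    # computed as C(u, v) = vol(G) * (L+[u,u] + L+[v,v] - 2 * L+[u,v]),
    # where L+ is the pseudoinverse of the combinatorial Laplacian.
    deg = adj.sum(axis=1)
    L = np.diag(deg) - adj
    Lp = np.linalg.pinv(L)
    vol = deg.sum()  # 2 * |E| for an undirected graph
    return vol * (Lp[u, u] + Lp[v, v] - 2 * Lp[u, v])

# Path graph 0 - 1 - 2: effective resistance between the endpoints is 2
# (two unit edges in series), so the commute time is vol * R = 4 * 2 = 8.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
ct = commute_time(A, 0, 2)
```

The term in parentheses is the effective resistance between u and v, which is why rewirings that add edges between distant nodes shrink commute times.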
Author:
Bounsi, Wilfried, Ibarz, Borja, Dudzik, Andrew, Hamrick, Jessica B., Markeeva, Larisa, Vitvitskyi, Alex, Pascanu, Razvan, Veličković, Petar
Transformers have revolutionized machine learning with their simple yet effective architecture. Pre-training Transformers on massive text datasets from the Internet has led to unmatched generalization for natural language understanding (NLU) tasks. H…
External link:
http://arxiv.org/abs/2406.09308
Author:
Barbero, Federico, Banino, Andrea, Kapturowski, Steven, Kumaran, Dharshan, Araújo, João G. M., Vitvitskyi, Alex, Pascanu, Razvan, Veličković, Petar
We study how information propagates in decoder-only Transformers, which are the architectural backbone of most existing frontier large language models (LLMs). We rely on a theoretical signal propagation analysis -- specifically, we analyse the representations…
External link:
http://arxiv.org/abs/2406.04267
Author:
Markeeva, Larisa, McLeish, Sean, Ibarz, Borja, Bounsi, Wilfried, Kozlova, Olga, Vitvitskyi, Alex, Blundell, Charles, Goldstein, Tom, Schwarzschild, Avi, Veličković, Petar
Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path towards building intelligent systems. Most recent studies dedicated to reasoning focus on out-of-distribution performance on procedurally-generated synthetic…
External link:
http://arxiv.org/abs/2406.04229
Evolving relations in real-world networks are often modelled by temporal graphs. Graph rewiring techniques have been utilised on Graph Neural Networks (GNNs) to improve expressiveness and increase model performance. In this work, we propose Temporal…
External link:
http://arxiv.org/abs/2406.02362
Author:
Gavranović, Bruno, Lessard, Paul, Dudzik, Andrew, von Glehn, Tamara, Araújo, João G. M., Veličković, Petar
We present our position on the elusive quest for a general-purpose framework for specifying and studying deep learning architectures. Our opinion is that the key attempts made so far lack a coherent bridge between specifying constraints which models…
External link:
http://arxiv.org/abs/2402.15332