Showing 1 - 10 of 52 for search: '"Hersche, Michael"'
Author:
Thomm, Jonathan, Hersche, Michael, Camposampiero, Giacomo, Terzić, Aleksandar, Schölkopf, Bernhard, Rahimi, Abbas
We advance the recently proposed neuro-symbolic Differentiable Tree Machine, which learns tree operations using a combination of transformers and Tensor Product Representations. We investigate the architecture and propose two key components. We first
External link:
http://arxiv.org/abs/2407.02060
Author:
Camposampiero, Giacomo, Hersche, Michael, Terzić, Aleksandar, Wattenhofer, Roger, Sebastian, Abu, Rahimi, Abbas
We introduce the Abductive Rule Learner with Context-awareness (ARLC), a model that solves abstract reasoning tasks based on Learn-VRF. ARLC features a novel and more broadly applicable training objective for abductive reasoning, resulting in better
External link:
http://arxiv.org/abs/2406.19121
Author:
Wibowo, Yoga Esa, Cioflan, Cristian, Ingolfsson, Thorir Mar, Hersche, Michael, Zhao, Leo, Rahimi, Abbas, Benini, Luca
Few-Shot Class-Incremental Learning (FSCIL) enables machine learning systems to expand their inference capabilities to new classes using only a few labeled examples, without forgetting the previously learned classes. Classical backpropagation-based l
External link:
http://arxiv.org/abs/2403.07851
Author:
Thomm, Jonathan, Camposampiero, Giacomo, Terzic, Aleksandar, Hersche, Michael, Schölkopf, Bernhard, Rahimi, Abbas
We analyze the capabilities of Transformer language models in learning compositional discrete tasks. To this end, we evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks demanding to learn a composition of several discrete sub-
External link:
http://arxiv.org/abs/2402.05785
Author:
Ruffino, Samuele, Karunaratne, Geethan, Hersche, Michael, Benini, Luca, Sebastian, Abu, Rahimi, Abbas
Classification based on Zero-shot Learning (ZSL) is the ability of a model to classify inputs into novel classes on which the model has not previously seen any training examples. Providing an auxiliary descriptor in the form of a set of attributes de
External link:
http://arxiv.org/abs/2401.16876
Abstract reasoning is a cornerstone of human intelligence, and replicating it with artificial intelligence (AI) presents an ongoing challenge. This study focuses on efficiently solving Raven's progressive matrices (RPM), a visual test for assessing a
External link:
http://arxiv.org/abs/2401.16024
Author:
Terzic, Aleksandar, Hersche, Michael, Karunaratne, Geethan, Benini, Luca, Sebastian, Abu, Rahimi, Abbas
MEGA is a recent transformer-based architecture, which utilizes a linear recurrent operator whose parallel computation, based on the FFT, scales as $O(L \log L)$, with $L$ being the sequence length. We build upon their approach by replacing the linear r
External link:
http://arxiv.org/abs/2312.05605
Author:
Menet, Nicolas, Hersche, Michael, Karunaratne, Geethan, Benini, Luca, Sebastian, Abu, Rahimi, Abbas
With the advent of deep learning, progressively larger neural networks have been designed to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of inference by exploiting computation in superposition. To reduce the
External link:
http://arxiv.org/abs/2312.02829
Author:
Hersche, Michael, Terzic, Aleksandar, Karunaratne, Geethan, Langenegger, Jovin, Pouget, Angéline, Cherubini, Giovanni, Benini, Luca, Sebastian, Abu, Rahimi, Abbas
Distributed sparse block codes (SBCs) exhibit compact representations for encoding and manipulating symbolic data structures using fixed-width vectors. One major challenge however is to disentangle, or factorize, the distributed representation of dat
External link:
http://arxiv.org/abs/2303.13957
Author:
Langenegger, Jovin, Karunaratne, Geethan, Hersche, Michael, Benini, Luca, Sebastian, Abu, Rahimi, Abbas
Disentanglement of constituent factors of a sensory signal is central to perception and cognition and hence is a critical task for future artificial intelligence systems. In this paper, we present a compute engine capable of efficiently factorizing h
External link:
http://arxiv.org/abs/2211.05052