Search results - "Franceschi, Luca"

Report

Authors: Chen, Yihong; Xu, Xiangxiang; Lu, Yao; Stenetorp, Pontus; Franceschi, Luca

We introduce a framework for expanding residual computational graphs using jets, operators that generalize truncated Taylor series. Our method provides a systematic approach to disentangle contributions of different computational paths to model predi

External link: http://arxiv.org/abs/2410.06024


Report

Evaluating Large Language Models with fmeval

Authors: Schwöbel, Pola; Franceschi, Luca; Zafar, Muhammad Bilal; Vasist, Keerthan; Malhotra, Aman; Shenhar, Tomer; Tailor, Pinal; Yilmaz, Pinar; Diamond, Michael; Donini, Michele

fmeval is an open source library to evaluate large language models (LLMs) in a range of tasks. It helps practitioners evaluate their model for task performance and along multiple responsible AI dimensions. This paper presents the library and exposes

External link: http://arxiv.org/abs/2407.12872


Report

Explaining Probabilistic Models with Distributional Values

Authors: Franceschi, Luca; Donini, Michele; Archambeau, Cédric; Seeger, Matthias

A large branch of explainable machine learning is grounded in cooperative game theory. However, research indicates that game-theoretic explanations may mislead or be hard to interpret. We argue that often there is a critical mismatch between what one

External link: http://arxiv.org/abs/2402.09947


Report

DAG Learning on the Permutahedron

Authors: Zantedeschi, Valentina; Franceschi, Luca; Kaddour, Jean; Kusner, Matt J.; Niculae, Vlad

We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data. Our approach optimizes over the polytope of permutation vectors, the so-called Permutahedron, to learn a topological orderin

External link: http://arxiv.org/abs/2301.11898


Report

Learning Discrete Directed Acyclic Graphs via Backpropagation

Authors: Wren, Andrew J.; Minervini, Pasquale; Franceschi, Luca; Zantedeschi, Valentina

Recently continuous relaxations have been proposed in order to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization. However, a number of techniques for fully discrete backpropagation could in

External link: http://arxiv.org/abs/2210.15353


Report

Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models

Authors: Minervini, Pasquale; Franceschi, Luca; Niepert, Mathias

The integration of discrete algorithmic components in deep learning architectures has numerous applications. Recently, Implicit Maximum Likelihood Estimation (IMLE, Niepert, Minervini, and Franceschi 2021), a class of gradient estimators for discrete

External link: http://arxiv.org/abs/2209.04862


Report

ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective

Authors: Chen, Yihong; Mishra, Pushkar; Franceschi, Luca; Minervini, Pasquale; Stenetorp, Pontus; Riedel, Sebastian

Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs). However, unlike GNNs, FMs struggle to incorporate node features and genera

External link: http://arxiv.org/abs/2207.09980


Report

Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

Authors: Niepert, Mathias; Minervini, Pasquale; Franceschi, Luca

Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end

External link: http://arxiv.org/abs/2106.01798


Report

On the Iteration Complexity of Hypergradient Computation

Authors: Grazzi, Riccardo; Franceschi, Luca; Pontil, Massimiliano; Salzo, Saverio

We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation. Important instances arising in machine learning include hyperparameter optimi

External link: http://arxiv.org/abs/2006.16218


Report

MARTHE: Scheduling the Learning Rate Via Online Hypergradients

Authors: Donini, Michele; Franceschi, Luca; Pontil, Massimiliano; Majumder, Orchid; Frasconi, Paolo

We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rate schedul

External link: http://arxiv.org/abs/1910.08525

