Search results - "PANANGADEN, PRAKASH"

Report

Authors: Bacci, Giorgio; Mardare, Radu; Panangaden, Prakash; Plotkin, Gordon

We study Polynomial Lawvere logic PL, a logic defined over the Lawvere quantale of extended positive reals with sum as tensor, to which we add multiplication, thereby obtaining a semiring structure. PL is designed for complex quantitative reasoning,

External link: http://arxiv.org/abs/2402.03543

Report

Behavioural pseudometrics for continuous-time diffusions

Authors: Chen, Linan; Clerc, Florence; Panangaden, Prakash

Bisimulation is a concept that captures behavioural equivalence of states in a variety of types of transition systems. It has been widely studied in a discrete-time setting where the notion of a step is fundamental. In our setting we are considering

External link: http://arxiv.org/abs/2312.16729

Report

Conditions on Preference Relations that Guarantee the Existence of Optimal Policies

Authors: Carr, Jonathan Colaço; Panangaden, Prakash; Precup, Doina

Learning from Preferential Feedback (LfPF) plays an essential role in training Large Language Models, as well as certain types of interactive learning agents. However, a substantial gap exists between the theory and application of LfPF algorithms. Cu

External link: http://arxiv.org/abs/2311.01990

Report

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Authors: Castro, Pablo Samuel; Kastner, Tyler; Panangaden, Prakash; Rowland, Mark

Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning. We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels.

External link: http://arxiv.org/abs/2310.19804

Report

Optimal Approximate Minimization of One-Letter Weighted Finite Automata

Authors: Lacroce, Clara; Balle, Borja; Panangaden, Prakash; Rabusseau, Guillaume

In this paper, we study the approximate minimization problem of weighted finite automata (WFAs): to compute the best possible approximation of a WFA given a bound on the number of states. By reformulating the problem in terms of Hankel matrices, we l

External link: http://arxiv.org/abs/2306.00135

Report

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Authors: Panangaden, Prakash; Rezaei-Shoshtari, Sahand; Zhao, Rosie; Meger, David; Precup, Doina

Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision

External link: http://arxiv.org/abs/2305.05666

Report

Propositional Logics for the Lawvere Quantale

Authors: Bacci, Giorgio; Mardare, Radu; Panangaden, Prakash; Plotkin, Gordon

Published in: Electronic Notes in Theoretical Informatics and Computer Science, Volume 3 - Proceedings of MFPS XXXIX (November 23, 2023) entics:12292

Lawvere showed that generalised metric spaces are categories enriched over $[0, \infty]$, the quantale of the positive extended reals. The statement of enrichment is a quantitative analogue of being a preorder. Towards seeking a logic for quantitativ

External link: http://arxiv.org/abs/2302.01224

Report

Sum and Tensor of Quantitative Effects

Authors: Bacci, Giorgio; Mardare, Radu; Panangaden, Prakash; Plotkin, Gordon

Published in: Logical Methods in Computer Science, Volume 20, Issue 4 (October 29, 2024) lmcs:10761

Inspired by the seminal work of Hyland, Plotkin, and Power on the combination of algebraic computational effects via sum and tensor, we develop an analogous theory for the combination of quantitative algebraic effects. Quantitative algebraic effects

External link: http://arxiv.org/abs/2212.11784

Report

Continuous MDP Homomorphisms and Homomorphic Policy Gradient

Authors: Rezaei-Shoshtari, Sahand; Zhao, Rosie; Panangaden, Prakash; Meger, David; Precup, Doina

Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this paper, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to en

External link: http://arxiv.org/abs/2209.07364

Report

Riemannian Diffusion Models

Authors: Huang, Chin-Wei; Aghajohari, Milad; Bose, Avishek Joey; Panangaden, Prakash; Courville, Aaron

Diffusion models are recent state-of-the-art methods for image generation and likelihood estimation. In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood esti

External link: http://arxiv.org/abs/2208.07949
