Showing 1 - 4 of 4 for search: '"Dobhal, Daksh"'
This paper presents $\forall$uto$\exists$$\lor\!\land$L, a novel benchmark for scaling Large Language Model (LLM) assessment in formal tasks with clear notions of correctness, such as truth maintenance in translation and logical reasoning. $\forall$u
External link:
http://arxiv.org/abs/2410.08437
Authors:
Dobhal, Daksh; Nagpal, Jayesh; Karia, Rushang; Verma, Pulkit; Nayyar, Rashmeet Kaur; Shah, Naman; Srivastava, Siddharth
Understanding how robots plan and execute tasks is crucial in today's world, where they are becoming more prevalent in our daily lives. However, teaching non-experts the complexities of robot planning can be challenging. This work presents an open-so
External link:
http://arxiv.org/abs/2404.00808
$\forall$uto$\exists$val: Autonomous Assessment of LLMs in Formal Synthesis and Interpretation Tasks
This paper presents $\forall$uto$\exists$val, a new approach for scaling LLM assessment in translating formal syntax -- such as first-order logic, regular expressions, etc. -- to natural language (interpretation) or vice versa (compilation), thereby f
External link:
http://arxiv.org/abs/2403.18327
Published in:
In Procedia Computer Science 2020 171:130-138