Showing 1 - 10 of 28 for search: '"Cohen, Michael K."'
In reinforcement learning, if the agent's reward differs from the designers' true utility, even if only rarely, the state distribution resulting from the agent's policy can be very bad, in theory and in practice. When RL policies would devolve into unde…
External link:
http://arxiv.org/abs/2410.06213
Author:
Bengio, Yoshua, Cohen, Michael K., Malkin, Nikolay, MacDermott, Matt, Fornasiere, Damiano, Greiner, Pietro, Kaddar, Younesse
Is there a way to design powerful AI systems based on machine learning methods that would satisfy probabilistic safety guarantees? With the long-term goal of obtaining a probabilistic guarantee that would apply in every context, we consider estimatin…
External link:
http://arxiv.org/abs/2408.05284
Published in:
Adv.Neur.Info.Proc.Sys. 35 (2022) 8118-8129
Gaussian processes (GPs) produce good probabilistic models of functions, but most GP kernels require $O((n+m)n^2)$ time, where $n$ is the number of data points and $m$ the number of predictive locations. We present a new kernel that allows for Gaussi…
External link:
http://arxiv.org/abs/2210.01633
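To ground the complexity claim in the abstract above, here is a minimal sketch of exact GP regression, where the Cholesky factorization of the training covariance costs $O(n^3)$ and each predictive location adds $O(n)$ — the kind of cost a cheaper kernel aims to avoid. The RBF kernel, lengthscale, noise level, and toy data below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel matrix between 1-D point sets a and b.
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-4):
    # Exact GP posterior mean. The Cholesky factorization below is O(n^3),
    # and the cross-covariance with m test points adds O(mn).
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return rbf_kernel(x_test, x_train) @ alpha

x_train = np.linspace(0.0, 1.0, 20)
y_train = np.sin(2 * np.pi * x_train)
mean = gp_predict(x_train, y_train, np.array([0.25, 0.75]))
```

On this toy sine fit, the posterior mean at 0.25 and 0.75 lands near the true function values of +1 and -1.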
Published in:
Journal of Selected Areas in Information Theory 2 (2021)
Algorithmic Information Theory has inspired intractable constructions of general intelligence (AGI), and undiscovered tractable approximations are likely feasible. Reinforcement Learning (RL), the dominant paradigm by which an agent might learn to so…
External link:
http://arxiv.org/abs/2105.06268
In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had be…
External link:
http://arxiv.org/abs/2102.08686
Author:
Cohen, Michael K., Hutter, Marcus
If we could define the set of all bad outcomes, we could hard-code an agent which avoids them; however, in sufficiently complex environments, this is infeasible. We do not know of any general-purpose approaches in the literature to avoiding novel fai…
External link:
http://arxiv.org/abs/2006.08753
Published in:
Journal of Selected Areas in Information Theory 2 (2021)
Reinforcement learners are agents that learn to pick actions that lead to high reward. Ideally, the value of a reinforcement learner's policy approaches optimality, where the optimal informed policy is the one which maximizes reward. Unfortunately, w…
External link:
http://arxiv.org/abs/2006.03357
Published in:
Proc.AAAI. 34 (2020) 2467-2476
General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible. Narrow intelligence, the ability to solve a given particularly difficult problem, has seen impressive recent development. No…
External link:
http://arxiv.org/abs/1905.12186
Published in:
Proc.IJCAI (2019) 2179-2186
Reinforcement Learning agents are expected to eventually perform well. Typically, this takes the form of a guarantee about the asymptotic behavior of an algorithm given some assumptions about the environment. We present an algorithm for a policy whos…
External link:
http://arxiv.org/abs/1903.01021
Author:
Cohen, Michael K. (mkcohen@berkeley.edu), Kolt, Noam, Bengio, Yoshua, Hadfield, Gillian K., Russell, Stuart
Published in:
Science, Vol. 384, Issue 6691 (4/5/2024), pp. 36-38.