Showing 1 - 10 of 2,123 for search: '"Radanović A"'
Published in:
Zbornik Instituta za pedagoška istraživanja, Vol 56, Iss 1, Pp 79-97 (2024)
The research aimed to examine parental perceptions of children’s reactions and behavioral changes related to the coronavirus and the measures implemented to curb the pandemic. Due to the shutdown of educational institutions, the introduction of
External link:
https://doaj.org/article/6b77cb65bf41442ca31f70628ca52ce8
Author:
Mandal, Debmalya, Radanovic, Goran
We study the setting of performative reinforcement learning where the deployed policy affects both the reward and the transition of the underlying Markov decision process. Prior work (MTR23) has addressed this problem under the tabu
External link:
http://arxiv.org/abs/2411.05234
We address the challenge of explaining counterfactual outcomes in multi-agent Markov decision processes. In particular, we aim to explain the total counterfactual effect of an agent's action on the outcome of a realized scenario through its influence
External link:
http://arxiv.org/abs/2410.12539
Published in:
Psihologija, Vol 54, Iss 3, Pp 323-345 (2021)
The aim of our study was to explore relations between parents’ and children’s fear of COVID-19, parents’ dispositions (emotion regulation, self-efficacy, the anxiety trait) and their distress (due to the pandemic, the national state of emerge
External link:
https://doaj.org/article/ded59123129b47b0bc88d447f14c5650
Author:
Radanović, Luka, Fellague, Abdelkadir, Ostojić, Dragutin, Stevanović, Dragan, Davidović, Tatjana
We consider the problem of characterizing graphs with the maximum spectral radius among the connected graphs with given numbers of vertices and edges. It is well-known that the candidates for extremal graphs are threshold graphs, but only a few parti
External link:
http://arxiv.org/abs/2406.19209
We study data corruption robustness in offline two-player zero-sum Markov games. Given a dataset of realized trajectories of two players, an adversary is allowed to modify an $\epsilon$-fraction of it. The learner's goal is to identify an approximate
External link:
http://arxiv.org/abs/2403.07933
Author:
Nika, Andi, Mandal, Debmalya, Kamalaruban, Parameswaran, Tzannetos, Georgios, Radanović, Goran, Singla, Adish
In this paper, we take a step towards a deeper understanding of learning from human preferences by systematically comparing the paradigm of reinforcement learning from human feedback (RLHF) with the recently proposed paradigm of direct preference opt
External link:
http://arxiv.org/abs/2403.01857
Author:
Sukovic, Aleksa, Radanovic, Goran
Equipping agents with the capacity to justify made decisions using supporting evidence represents a cornerstone of accountable decision-making. Furthermore, ensuring that justifications are in line with human expectations and societal norms is vital,
External link:
http://arxiv.org/abs/2402.15826
When Reinforcement Learning (RL) agents are deployed in practice, they might impact their environment and change its dynamics. We propose a new framework to model this phenomenon, where the current environment depends on the deployed policy as well a
External link:
http://arxiv.org/abs/2402.09838
We study data corruption robustness for reinforcement learning with human feedback (RLHF) in an offline setting. Given an offline dataset of pairs of trajectories along with feedback about human preferences, an $\varepsilon$-fraction of the pairs is
External link:
http://arxiv.org/abs/2402.06734