Showing 1 - 10 of 2,028 results for the query: '"Zanette"'
Author:
Sun, Hanshi, Haider, Momin, Zhang, Ruiqi, Yang, Huitao, Qiu, Jiahao, Yin, Ming, Wang, Mengdi, Bartlett, Peter, Zanette, Andrea
The safe and effective deployment of Large Language Models (LLMs) involves a critical step called alignment, which ensures that the model's responses are in accordance with human preferences. Prevalent alignment techniques, such as DPO, PPO and their
External link:
http://arxiv.org/abs/2410.20290
This paper explores the application of Machine Learning techniques for pricing high-dimensional options within the framework of the Uncertain Volatility Model (UVM). The UVM is a robust framework that accounts for the inherent unpredictability of mar
External link:
http://arxiv.org/abs/2407.13213
This paper extends the valuation and optimal surrender framework for variable annuities with guaranteed minimum benefits in a Lévy equity market environment by incorporating a stochastic interest rate described by the Hull-White model. This approac
External link:
http://arxiv.org/abs/2404.07658
Author:
Zanette, Damián H., Samengo, Inés
Published in:
Entropy 2024, 26(1), 51
The Central Limit Theorem states that, in the limit of a large number of terms, an appropriately scaled sum of independent random variables yields another random variable whose probability distribution tends to a stable distribution. The condition of
External link:
http://arxiv.org/abs/2404.03808
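The classical finite-variance case of the statement above can be checked numerically. The following sketch (NumPy; my own illustration, not code from the paper) standardizes sums of independent Uniform(0,1) variables and verifies that the result approaches a standard normal, whose mean is 0 and standard deviation 1:

```python
import numpy as np

rng = np.random.default_rng(0)

def scaled_sum(n_terms, n_samples=100_000):
    """Draw n_samples sums of n_terms iid Uniform(0,1) variables and
    standardize each as (S - n*mu) / (sigma * sqrt(n))."""
    mu, sigma = 0.5, np.sqrt(1.0 / 12.0)  # mean and std of Uniform(0,1)
    s = rng.random((n_samples, n_terms)).sum(axis=1)
    return (s - n_terms * mu) / (sigma * np.sqrt(n_terms))

# As n_terms grows, the empirical mean and std tend to 0 and 1,
# consistent with convergence to the standard normal distribution.
z = scaled_sum(50)
```

The paper itself concerns relaxations of the independence condition; this sketch only illustrates the baseline theorem it departs from.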
A broad use case of large language models (LLMs) is in goal-directed decision-making tasks (or "agent" tasks), where an LLM needs to not just generate completions for a given prompt, but rather make intelligent decisions over a multi-turn interaction
External link:
http://arxiv.org/abs/2402.19446
What can an agent learn in a stochastic Multi-Armed Bandit (MAB) problem from a dataset that contains just a single sample for each arm? Surprisingly, in this work, we demonstrate that even in such a data-starved setting it may still be possible to f
External link:
http://arxiv.org/abs/2402.15703
Author:
Zhang, Ruiqi, Zanette, Andrea
In some applications of reinforcement learning, a dataset of pre-collected experience is already available but it is also possible to acquire some additional online data to help improve the quality of the policy. However, it may be preferable to gath
External link:
http://arxiv.org/abs/2307.04354
In this article, we introduce an algorithm called Backward Hedging, designed for hedging European and American options while considering transaction costs. The optimal strategy is determined by minimizing an appropriate loss function, which is based
External link:
http://arxiv.org/abs/2305.06805
Author:
Zanette, Andrea
Model-free algorithms for reinforcement learning typically require a condition called Bellman completeness in order to successfully operate off-policy with function approximation, unless additional conditions are met. However, Bellman completeness is
External link:
http://arxiv.org/abs/2211.05311
Published in:
Phys. Rev. Research 5, 023014 (2023)
We present a systematic study of the dynamical phase diagram of a periodically driven BCS system as a function of drive strength and frequency. Three different driving mechanisms are considered and compared: oscillating density of states, oscillating
External link:
http://arxiv.org/abs/2210.15693