Showing 1 - 10 of 12,968 for search: '"P Azar"'
Author:
Goldfriend, Tomer, Reichental, Israel, Naveh, Amir, Gazit, Lior, Yoran, Nadav, Alon, Ravid, Ur, Shmuel, Lahav, Shahak, Cornfeld, Eyal, Elazari, Avi, Emanuel, Peleg, Harpaz, Dor, Michaeli, Tal, Erez, Nati, Preminger, Lior, Shapira, Roman, Garcell, Erik Michael, Samimi, Or, Kisch, Sara, Hallel, Gil, Kishony, Gilad, van Wingerden, Vincent, Rosenbloom, Nathaniel A., Opher, Ori, Vax, Matan, Smoler, Ariel, Danzig, Tamuz, Schirman, Eden, Sella, Guy, Cohen, Ron, Garfunkel, Roi, Cohn, Tali, Rosemarin, Hanan, Hass, Ron, Jankiewicz, Klem, Gharra, Karam, Roth, Ori, Azar, Barak, Asban, Shahaf, Linkov, Natalia, Segman, Dror, Sahar, Ohad, Davidson, Niv, Minerbi, Nir, Naveh, Yehuda
We present a scalable, robust approach to creating quantum programs of arbitrary size and complexity. The approach is based on the true abstraction of the problem. The quantum program is expressed in terms of a high-level model together with constraints …
External link:
http://arxiv.org/abs/2412.07372
Author:
Louzi, Azar
Stochastic gradient descent (SGD) has been a go-to algorithm for nonconvex stochastic optimization problems arising in machine learning. Its theory, however, often requires a strong framework to guarantee convergence properties. We hereby present a ful…
External link:
http://arxiv.org/abs/2412.06070
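The entry above concerns convergence guarantees for SGD on nonconvex problems. As plain background, here is a minimal NumPy sketch of the vanilla SGD recursion with Robbins-Monro step sizes on a toy stochastic objective; the objective, step-size schedule, and constants are illustrative assumptions, not the paper's setting.

import numpy as np

rng = np.random.default_rng(0)

def grad_estimate(x, xi):
    # Unbiased gradient sample for the toy objective f(x) = E[(x - xi)^2 cos(x)], xi ~ N(0, 1):
    # differentiate the integrand in x at the sampled xi.
    return 2.0 * (x - xi) * np.cos(x) - (x - xi) ** 2 * np.sin(x)

x = 3.0
for n in range(1, 10_001):
    xi = rng.standard_normal()          # fresh sample of the randomness
    gamma = 0.1 / n ** 0.6              # Robbins-Monro steps: sum(gamma) = inf, sum(gamma^2) < inf
    x -= gamma * grad_estimate(x, xi)   # stochastic gradient step
print(x)                                # approaches a stationary point, not necessarily a global minimum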
Author:
Nakhl, Azar C., Harper, Ben, West, Maxwell, Dowling, Neil, Sevior, Martin, Quella, Thomas, Usman, Muhammad
This work augments the recently introduced Stabilizer Tensor Network (STN) protocol with magic state injection, reporting a new framework with significantly enhanced ability to simulate circuits with an extensive number of non-Clifford operations. Sp…
External link:
http://arxiv.org/abs/2411.12482
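For readers unfamiliar with the term, "magic state injection" refers to implementing a non-Clifford gate (typically T) by consuming a pre-prepared magic state using only Clifford operations and measurement. The NumPy check below verifies the standard single-qubit T-gate injection gadget; it is generic background, not the STN protocol itself.

import numpy as np

T = np.diag([1.0, np.exp(1j * np.pi / 4)])
magic = np.array([1.0, np.exp(1j * np.pi / 4)]) / np.sqrt(2)   # magic state |A> = T|+>
psi = np.array([0.6, 0.8j])                                    # arbitrary normalized data state

# Joint state |A>_ancilla (x) |psi>_data, then CNOT with the ancilla as control, data as target.
state = np.kron(magic, psi)
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
state = cnot @ state

for outcome in (0, 1):
    branch = state.reshape(2, 2)[:, outcome]      # project the data qubit onto |outcome>
    branch = branch / np.linalg.norm(branch)      # remaining state lives on the ancilla
    if outcome == 1:                              # Clifford correction S.X on measurement outcome 1
        S = np.diag([1.0, 1j])
        X = np.array([[0, 1], [1, 0]], dtype=complex)
        branch = S @ (X @ branch)
    target = T @ psi
    # The corrected ancilla equals T|psi> up to a global phase in both branches.
    print(outcome, np.isclose(abs(np.vdot(target, branch)), 1.0))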
Published in:
Trans. Theor. Math. Phys. (TTMP), vol 1(4), 2024
In solving the Brans-Dicke (BD) equations in the BD theory of gravity, their linear independence is important. This is due to the fact that, in solving these equations in cosmology, if the number of unknown quantities is equal to the number of independent …
External link:
http://arxiv.org/abs/2410.13316
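For context (standard textbook form in one common convention with c = 1; the paper's notation may differ), the Brans-Dicke field equations for the metric $g_{\mu\nu}$ and the scalar field $\phi$ read

$G_{\mu\nu} = \frac{8\pi}{\phi} T_{\mu\nu} + \frac{\omega}{\phi^{2}}\Big(\nabla_{\mu}\phi\,\nabla_{\nu}\phi - \tfrac{1}{2} g_{\mu\nu}\,\nabla_{\alpha}\phi\,\nabla^{\alpha}\phi\Big) + \frac{1}{\phi}\Big(\nabla_{\mu}\nabla_{\nu}\phi - g_{\mu\nu}\,\Box\phi\Big), \qquad \Box\phi = \frac{8\pi}{3 + 2\omega}\, T .$

Because the contracted Bianchi identity ties these equations together, not all of their components in a cosmological ansatz are independent, which is why the counting of independent equations versus unknowns emphasized in the snippet matters.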
Author:
Azar, Eyar, Nadler, Boaz
The premise of semi-supervised learning (SSL) is that combining labeled and unlabeled data yields significantly more accurate models. Despite empirical successes, the theoretical understanding of SSL is still far from complete. In this work, we study …
External link:
http://arxiv.org/abs/2409.03335
Crépey, Frikha, and Louzi (2023) introduced a multilevel stochastic approximation scheme to compute the value-at-risk of a financial loss that is only simulatable by Monte Carlo. The optimal complexity of the scheme is in $O({\varepsilon}^{-5/2})$, …
External link:
http://arxiv.org/abs/2408.06531
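As hedged background for the entry above: at a single level, the value-at-risk (the $\alpha$-quantile of the loss) can be computed by a Robbins-Monro stochastic approximation, and multilevel schemes of the kind discussed in the paper build on such recursions. The toy loss and step sizes below are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(1)
alpha = 0.975                  # VaR_alpha is the alpha-quantile of the loss distribution
xi = 0.0                       # running VaR estimate

def sample_loss():
    # Toy loss; in the paper's setting the loss is only simulatable by (nested) Monte Carlo.
    return rng.standard_normal() + 0.5 * rng.standard_normal() ** 2

for n in range(1, 200_001):
    X = sample_loss()
    gamma = 1.0 / n ** 0.75                   # Robbins-Monro step sizes
    xi -= gamma * (float(X <= xi) - alpha)    # seeks the root of P(X <= xi) - alpha, i.e. the quantile
print(xi)                                     # estimate of VaR_alpha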
In this short article, using a left-invariant Randers metric $F$, we define a new left-invariant Randers metric $\tilde{F}$. We show that $F$ is of Berwald (Douglas) type if and only if $\tilde{F}$ is of Berwald (Douglas) type. In the case of Berwald …
External link:
http://arxiv.org/abs/2407.21044
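For background (standard Finsler-geometry definitions, not the paper's specific construction of $\tilde{F}$): a Randers metric has the form $F(x, y) = \alpha(x, y) + \beta(x, y) = \sqrt{a_{ij}(x)\, y^{i} y^{j}} + b_{i}(x)\, y^{i}$ with $\|\beta\|_{\alpha} < 1$, where $\alpha$ is a Riemannian metric and $\beta$ a 1-form. A well-known characterization states that such an $F$ is of Berwald type exactly when $\beta$ is parallel with respect to the Levi-Civita connection of $\alpha$.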
Author:
Grinsztajn, Nathan, Flet-Berliac, Yannis, Azar, Mohammad Gheshlaghi, Strub, Florian, Wu, Bill, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Pietquin, Olivier, Geist, Matthieu
To better align Large Language Models (LLMs) with human judgment, Reinforcement Learning from Human Feedback (RLHF) learns a reward model and then optimizes it using regularized RL. Recently, direct alignment methods were introduced to learn such a f…
External link:
http://arxiv.org/abs/2406.19188
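To make "direct alignment" concrete, the sketch below computes the loss of one well-known member of that family, DPO, from sequence-level log-probabilities under the policy and a frozen reference model. The numbers and the beta value are placeholders, and this is not necessarily the specific objective studied in the paper.

import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit-reward margin between the preferred and rejected completions.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)), written stably as log(1 + exp(-beta * margin)).
    return np.logaddexp(0.0, -beta * margin)

# Toy usage with sequence-level log-probabilities (policy vs. frozen reference).
print(dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
               ref_logp_chosen=-13.0, ref_logp_rejected=-14.0))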
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
Author:
Flet-Berliac, Yannis, Grinsztajn, Nathan, Strub, Florian, Choi, Eugene, Cremer, Chris, Ahmadian, Arash, Chandak, Yash, Azar, Mohammad Gheshlaghi, Pietquin, Olivier, Geist, Matthieu
Reinforcement Learning (RL) has been used to finetune Large Language Models (LLMs) using a reward model trained from preference data, to better align with human judgment. The recently introduced direct alignment methods, which are often simpler, more …
External link:
http://arxiv.org/abs/2406.19185
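The snippet above stops short of the method itself; purely to illustrate what optimizing "sequence-level scores" means in a policy-gradient objective, here is a generic REINFORCE-with-baseline loss over whole sampled sequences. This is a standard construction, not the paper's Contrastive Policy Gradient objective.

import numpy as np

def seq_pg_loss(seq_logprobs, seq_scores):
    # seq_logprobs[i] = log pi_theta(y_i | x); seq_scores[i] = scalar score of sequence y_i.
    seq_logprobs = np.asarray(seq_logprobs, dtype=float)
    seq_scores = np.asarray(seq_scores, dtype=float)
    baseline = seq_scores.mean()                 # batch-mean baseline reduces gradient variance
    advantages = seq_scores - baseline
    return -(advantages * seq_logprobs).mean()   # minimizing this ascends the expected score

print(seq_pg_loss(seq_logprobs=[-10.0, -12.5, -9.0], seq_scores=[0.2, 0.9, 0.4]))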
Both online and offline RLHF methods, such as PPO and DPO, have been extremely successful in aligning AI with human preferences. Despite their success, the existing methods suffer from a fundamental problem: their optimal solution is highly task-dependent …
External link:
http://arxiv.org/abs/2406.01660