Výsledky vyhledávání - "Jin, Tiancheng"

Report

ScaffML: A Quantum Behavioral Interface Specification Language for Scaffold

Ensuring the correctness of quantum programs is crucial for quantum software quality assurance. Although various effective verification methods exist for classical programs, they cannot be applied to quantum programs due to the fundamental difference

Externí odkaz: http://arxiv.org/abs/2306.06468

Zobrazit plný text záznamu

Report

No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions

Autor: Jin, Tiancheng, Liu, Junyan, Rouyer, Chloé, Chang, William, Wei, Chen-Yu, Luo, Haipeng

Existing online learning algorithms for adversarial Markov Decision Processes achieve ${O}(\sqrt{T})$ regret after $T$ rounds of interactions even if the loss functions are chosen arbitrarily by an adversary, with the caveat that the transition funct

Externí odkaz: http://arxiv.org/abs/2305.17380

Zobrazit plný text záznamu

Report

Heterogeneous Directed Hypergraph Neural Network over abstract syntax tree (AST) for Code Classification

Autor: Yang, Guang, Jin, Tiancheng, Dou, Liang

Code classification is a difficult issue in program understanding and automatic coding. Due to the elusive syntax and complicated semantics in programs, most existing studies use techniques based on abstract syntax tree (AST) and graph neural network

Externí odkaz: http://arxiv.org/abs/2305.04228

Zobrazit plný text záznamu

Report

Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms

Autor: Jin, Tiancheng, Liu, Junyan, Luo, Haipeng

We study the problem of designing adaptive multi-armed bandit algorithms that perform optimally in both the stochastic setting and the adversarial setting simultaneously (often known as a best-of-both-world guarantee). A line of recent works shows th

Externí odkaz: http://arxiv.org/abs/2302.13534

Zobrazit plný text záznamu

Report

Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Autor: Jin, Tiancheng, Lancewicki, Tal, Luo, Haipeng, Mansour, Yishay, Rosenberg, Aviv

The standard assumption in reinforcement learning (RL) is that agents observe feedback for their actions immediately. However, in practice feedback is often observed in delay. This paper studies online learning in episodic Markov decision process (MD

Externí odkaz: http://arxiv.org/abs/2201.13172

Zobrazit plný text záznamu

Report

The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition

Autor: Jin, Tiancheng, Huang, Longbo, Luo, Haipeng

We consider the best-of-both-worlds problem for learning an episodic Markov Decision Process through $T$ episodes, with the goal of achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ regret when the losses are adversarial and simultaneously $\mathcal{O}(\

Externí odkaz: http://arxiv.org/abs/2106.04117

Zobrazit plný text záznamu

Report

Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition

Autor: Jin, Tiancheng, Luo, Haipeng

This work studies the problem of learning episodic Markov Decision Processes with known transition and bandit feedback. We develop the first algorithm with a ``best-of-both-worlds'' guarantee: it achieves $\mathcal{O}(log T)$ regret when the losses a

Externí odkaz: http://arxiv.org/abs/2006.05606

Zobrazit plný text záznamu

Report

Learning Adversarial MDPs with Bandit Feedback and Unknown Transition

Autor: Jin, Chi, Jin, Tiancheng, Luo, Haipeng, Sra, Suvrit, Yu, Tiancheng

We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses. We propose an efficient algorithm that achieves $\mathcal{\tilde{O}}(L|X|\sqrt{|A|T

Externí odkaz: http://arxiv.org/abs/1912.01192

Zobrazit plný text záznamu

Report

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

Autor: Holler, John, Vuorio, Risto, Qin, Zhiwei, Tang, Xiaocheng, Jiao, Yan, Jin, Tiancheng, Singh, Satinder, Wang, Chenxi, Ye, Jieping

Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace. Hand-crafting heuristic solutions that account for the d

Externí odkaz: http://arxiv.org/abs/1911.11260

Zobrazit plný text záznamu

Akademický článek

Tento výsledek nelze pro nepřihlášené uživatele zobrazit.
K zobrazení výsledku je třeba se přihlásit.

Vyhledávací nástroje:

Upřesnit hledání