Showing 1 - 10 of 194 for the search: '"Tan, Xiaoyang"'
Proximal Policy Optimization (PPO) is a popular model-free reinforcement learning algorithm, esteemed for its simplicity and efficacy. However, due to its inherent on-policy nature, its proficiency in harnessing data from disparate policies is constrained…
External link:
http://arxiv.org/abs/2406.03894
Author:
Wang, Yuhui, Strupl, Miroslav, Faccio, Francesco, Wu, Qingyuan, Liu, Haozhe, Grudzień, Michał, Tan, Xiaoyang, Schmidhuber, Jürgen
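As a hedged illustration of the method this entry builds on (the standard PPO clipped surrogate objective, not the paper's new algorithm; the function name and constants below are illustrative):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r = pi_new(a|s) / pi_old(a|s) and A is the advantage estimate.
    Clipping removes the incentive to move the ratio far from 1."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped)
```

Because the objective depends on the ratio to the data-collecting policy, the surrogate is only trusted near that policy, which is the on-policy constraint the snippet refers to.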
Learning from multi-step off-policy data collected by a set of policies is a core problem of reinforcement learning (RL). Approaches based on importance sampling (IS) often suffer from large variances due to products of IS ratios. Typical IS-free methods…
External link:
http://arxiv.org/abs/2405.18289
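The variance blow-up from products of IS ratios that the snippet mentions can be seen in a few lines. This is a generic Monte-Carlo illustration under an assumed log-normal ratio distribution, not the paper's setup:

```python
import numpy as np

# Each per-step ratio pi(a|s) / mu(a|s) has mean 1 (mean=-sigma**2/2 in
# log space makes E[ratio] = 1), yet the variance of the cumulative
# product grows roughly exponentially with the horizon.
rng = np.random.default_rng(0)
horizon = 20
ratios = rng.lognormal(mean=-0.02, sigma=0.2, size=(100_000, horizon))
products = np.cumprod(ratios, axis=1)   # cumulative IS weights
var_by_step = products.var(axis=0)      # variance after 1..horizon steps
```

Each cumulative weight is still unbiased, but its variance compounds multiplicatively, which is why long-horizon IS corrections become unusable.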
Retrieval-augmented generation (RAG) has rapidly advanced the language model field, particularly in question-answering (QA) systems. By integrating external documents during the response generation phase, RAG significantly enhances the accuracy and r…
External link:
http://arxiv.org/abs/2402.01767
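The retrieve-then-generate pattern the snippet describes can be sketched minimally; the word-overlap scorer and function names below are illustrative assumptions, not any specific RAG library's API:

```python
# Minimal RAG sketch: score external documents by word overlap with the
# question, then prepend the best match to the prompt handed to a
# generator. A real system would use dense embeddings, not set overlap.
def retrieve(question, documents):
    q_words = set(question.lower().split())
    # pick the document sharing the most words with the question
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question, documents):
    context = retrieve(question, documents)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"
```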
Most existing salient object detection methods use a U-Net or feature pyramid structure, which simply aggregates feature maps of different scales, ignoring their uniqueness and interdependence and their respective contributions to the final…
External link:
http://arxiv.org/abs/2309.08365
ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
Problems such as equipment defects or limited viewpoints will lead the captured point clouds to be incomplete. Therefore, recovering complete point clouds from partial ones plays a vital role in many practical tasks, and one of the keys lies…
External link:
http://arxiv.org/abs/2302.14435
Author:
Tan, Xiaoyang
Kō No. 24060
Global Environmental Studies Doctorate No. 223
新制||地環||42 (University Library)
Meets the requirements of Article 4, Paragraph 1 of the Degree Regulations
Doctor of Global Environmental Studies
Kyoto University
DFAM
External link:
http://hdl.handle.net/2433/275382
Offline reinforcement learning learns an effective policy from offline datasets without online interaction, and it attracts persistent research attention due to its potential for practical application. However, extrapolation error generated by distribution…
External link:
http://arxiv.org/abs/2301.01298
Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors, so as to improve the robustness to estimation errors. However, the method becomes problematic when the optimal action induced by the approximated…
External link:
http://arxiv.org/abs/2203.11677
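For context on the action gap both AL entries refer to, here is a minimal tabular sketch of the standard advantage-learning backup (the assumed textbook form, not this paper's corrected variant; names are illustrative):

```python
import numpy as np

def advantage_learning_backup(q, r, gamma, s, a, s_next, alpha=0.5):
    """One tabular AL backup:
    T_AL Q(s, a) = r + gamma * max_a' Q(s', a') - alpha * (max_b Q(s, b) - Q(s, a)).
    Subtracting alpha times the action gap pushes non-greedy actions down,
    widening the gap between the greedy action and its competitors."""
    bellman_target = r + gamma * q[s_next].max()
    action_gap = q[s].max() - q[s, a]   # zero for the greedy action
    return bellman_target - alpha * action_gap
```

At the greedy action the penalty vanishes, so only competitors are pushed down; when estimation errors make the wrong action greedy, the same penalty entrenches it, which is the failure mode these two papers address.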
Advantage learning (AL) aims to improve the robustness of value-based reinforcement learning against estimation errors with action-gap-based regularization. Unfortunately, the method tends to be unstable in the case of function approximation. In this…
External link:
http://arxiv.org/abs/2203.10445
Author:
Wen, Chao, Xu, Miao, Zhang, Zhilin, Zheng, Zhenzhe, Wang, Yuhui, Liu, Xiangyu, Rong, Yu, Xie, Dong, Tan, Xiaoyang, Yu, Chuan, Xu, Jian, Wu, Fan, Chen, Guihai, Zhu, Xiaoqiang, Zheng, Bo
In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works designed auto-bidding tools from the…
External link:
http://arxiv.org/abs/2106.06224