Search results - "Qingpeng Cai"

Multi-Task Recommendations with Reinforcement Learning

Authors: Ziru Liu, Jiejie Tian, Qingpeng Cai, Xiangyu Zhao, Jingtong Gao, Shuchang Liu, Dayou Chen, Tonghao He, Dong Zheng, Peng Jiang, Kun Gai

Published in: Proceedings of the ACM Web Conference 2023.

In recent years, Multi-task Learning (MTL) has yielded immense success in Recommender System (RS) applications. However, current MTL-based recommendation models tend to disregard the session-wise patterns of user-item interactions because they are pr

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c2d917a53523939729b0c34e7a0afca5
https://doi.org/10.1145/3543507.3583467

Exploration and Regularization of the Latent Action Space in Recommendation

Authors: Shuchang Liu, Qingpeng Cai, Bowen Sun, Yuhao Wang, Ji Jiang, Dong Zheng, Peng Jiang, Kun Gai, Xiangyu Zhao, Yongfeng Zhang

In recommender systems, reinforcement learning solutions have effectively boosted recommendation performance because of their ability to capture long-term user-system interaction. However, the action space of the recommendation policy is a list of it

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a169657ef4186e79b24c16d658d73634
http://arxiv.org/abs/2302.03431

Two-Stage Constrained Actor-Critic for Short Video Recommendation

Authors: Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai

The wide popularity of short videos on social media poses new opportunities and challenges to optimize recommender systems on the video-sharing platforms. Users sequentially interact with the system and provide complex and multi-faceted responses, in

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c98d05b5066385c8f749b380a1a38546

ELDA: Learning Explicit Dual-Interactions for Healthcare Analytics

Authors: Qingpeng Cai, Kaiping Zheng, Beng Chin Ooi, Wei Wang, Chang Yao

Published in: 2022 IEEE 38th International Conference on Data Engineering (ICDE).

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::face34416ffd77df7ef095c4745f581a
https://doi.org/10.1109/icde53745.2022.00034

Exploration in policy optimization through multiple paths

Authors: Ling Pan, Longbo Huang, Qingpeng Cai

Published in: Autonomous Agents and Multi-Agent Systems. 35

Recent years have witnessed a tremendous improvement of deep reinforcement learning. However, a challenging problem is that an agent may suffer from inefficient exploration, particularly for on-policy methods. Previous exploration methods either rely

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::c5a6088375ec8f435bd03d03f3680d5d
https://doi.org/10.1007/s10458-021-09518-6

A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems

Authors: Zhixuan Fang, Qingpeng Cai, Longbo Huang, Ling Pan, Pingzhong Tang

Published in: AAAI

Bike sharing provides an environment-friendly way for traveling and is booming all over the world. Yet, due to the high similarity of user travel patterns, the bike imbalance problem constantly occurs, especially for dockless bike sharing systems, ca

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9213df07698158143d4464f18c6afc16
https://doi.org/10.1609/aaai.v33i01.33011393

Reinforcement Learning with Dynamic Boltzmann Softmax Updates

Authors: Wei Chen, Qi Meng, Longbo Huang, Ling Pan, Qingpeng Cai

Published in: IJCAI

Value function estimation is an important task in reinforcement learning, i.e., prediction. The Boltzmann softmax operator is a natural value estimator and can provide several benefits. However, it does not satisfy the non-expansion property, and its

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::ed5223c1a240b1331cb50f9a23564797
https://doi.org/10.24963/ijcai.2020/276

Deterministic Value-Policy Gradients

Authors: Qingpeng Cai, Ling Pan, Pingzhong Tang

Published in: AAAI

Reinforcement learning algorithms such as the deep deterministic policy gradient algorithm (DDPG) have been widely used in continuous control tasks. However, the model-free DDPG algorithm suffers from high sample complexity. In this paper we consider

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::95500ebd963b207ed07f026dd96533a9
http://arxiv.org/abs/1909.03939

Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce

Authors: Qingpeng Cai, Aris Filos-Ratsikas, Pingzhong Tang, Yiwei Zhang

Published in: Proceedings of the AAAI Conference on Artificial Intelligence. 32

In large e-commerce websites, sellers have been observed to engage in fraudulent behaviour, faking historical transactions in order to receive favourable treatment from the platforms, specifically through the allocation of additional buyer impression

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::925e7dbd27b0a3126ca0980a469bea02
https://doi.org/10.1609/aaai.v32i1.11452

Reinforcement Mechanism Design for e-commerce

Authors: Yiwei Zhang, Pingzhong Tang, Aris Filos-Ratsikas, Qingpeng Cai

Published in: WWW
Cai, Q, Filos-Ratsikas, A, Tang, P & Zhang, Y 2018, Reinforcement Mechanism Design for E-Commerce. In Proceedings of the 2018 World Wide Web Conference. WWW '18, pp. 1339–1348, The Web Conference 2018, Lyon, France, 23/04/18. https://doi.org/10.1145/3178876.3186039

We study the problem of allocating impressions to sellers in e-commerce websites, such as Amazon, eBay or Taobao, aiming to maximize the total revenue generated by the platform. We employ a general framework of reinforcement mechanism design, which u

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a443b0fc65d383f2f7ad3e3abea7f527
https://doi.org/10.1145/3178876.3186039
