Search results

Clustering of conversational bandits with posterior sampling for user preference learning and elicitation

Authors: Qizhi Li, Canzhe Zhao, Tong Yu, Junda Wu, Shuai Li

Published in: User Modeling and User-Adapted Interaction.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::c5ebf90305d27e37ae043dc5042be153
https://doi.org/10.1007/s11257-023-09358-x

Knowledge-aware Conversational Preference Elicitation with Bandit Feedback

Authors: Canzhe Zhao, Tong Yu, Zhihui Xie, Shuai Li

Published in: Proceedings of the ACM Web Conference 2022.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::fb7590452cbbd9729c2c9fd5cb497c66
https://doi.org/10.1145/3485447.3512152

Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization

Authors: Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li

Temporal difference (TD) learning is a widely used method to evaluate policies in reinforcement learning. While many TD learning methods have been developed in recent years, little attention has been paid to preserving privacy and most of the existin

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8571e60c38d027249b80cd6f36c172a8

Clustering of Conversational Bandits for User Preference Learning and Elicitation

Authors: Canzhe Zhao, Junda Wu, Shuai Li, Jingyang Li, Tong Yu

Published in: CIKM

Conversational recommender systems elicit user preference via interactive conversational interactions. By introducing conversational key-terms, existing conversational recommenders can effectively reduce the need for extensive exploration in a tradit

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::d642fc546b668812dcbd3405c01b7536
https://doi.org/10.1145/3459637.3482328
