Showing 1 - 10 of 25 for search: '"Yue, Yuguang"'
Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been prop
External link:
http://arxiv.org/abs/2405.11446
Author:
Yue, Yuguang, Xie, Yuanpu, Wu, Huasen, Jia, Haofeng, Zhai, Shaodan, Shi, Wenzhe, Hunt, Jonathan J
Listwise ranking losses have been widely studied in recommender systems. However, new paradigms of content consumption present new challenges for ranking methods. In this work we contribute an analysis of learning to rank for personalized mobile push
External link:
http://arxiv.org/abs/2201.07681
To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-impl
External link:
http://arxiv.org/abs/2007.06159
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy gradient based deep RL algorithms
External link:
http://arxiv.org/abs/2002.03534
Author:
Li, Wenyuan, Wang, Zichen, Yue, Yuguang, Li, Jiayun, Speier, William, Zhou, Mingyuan, Arnold, Corey W.
In this work, we investigate semi-supervised learning (SSL) for image classification using adversarial training. Previous results have illustrated that generative adversarial networks (GANs) can be used for multiple purposes. Triple-GAN, which aims t
External link:
http://arxiv.org/abs/1910.08540
Selecting hyperparameters for unsupervised learning problems is challenging in general due to the lack of ground truth for validation. Despite the prevalence of this issue in statistics and machine learning, especially in clustering problems, there a
External link:
http://arxiv.org/abs/1910.08018
To address the challenge of backpropagating the gradient through categorical variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient estimator that is unbiased and has low variance. ARSM first uses variable augmentation, REINFORCE, and
External link:
http://arxiv.org/abs/1905.01413
T-optimal designs for multi-factor polynomial regression models via a semidefinite relaxation method
We consider T-optimal experiment design problems for discriminating multi-factor polynomial regression models where the design space is defined by polynomial inequalities and the regression parameters are constrained to given convex sets. Our propose
External link:
http://arxiv.org/abs/1807.08213
Academic article
This result cannot be displayed to users who are not logged in.
Author:
Gaut, Daria, Sim, Myung Shin, Yue, Yuguang, Wolf, Brian R., Abarca, Phillip A., Carroll, James M., Goldman, Jonathan W., Garon, Edward B.
Published in:
In Clinical Lung Cancer January 2018 19(1):e19-e28