Showing 1 - 10 of 64,942 for search: '"LI, Na"'
To address deviations from expected performance in stochastic systems, we propose a risk-sensitive control synthesis method to minimize certain risk measures over the limiting stationary distribution. Specifically, we extend Worst-case Conditional Value-at-Risk …
External link:
http://arxiv.org/abs/2410.17581
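For orientation, CVaR at level alpha is the expected cost in the worst (1 - alpha) tail of the cost distribution. A minimal empirical sketch in Python (a generic illustration, not the paper's synthesis method; the lognormal samples stand in for stationary-cost samples):

    # Minimal sketch (not the paper's method): empirical CVaR_alpha of a
    # cost sample, i.e. the mean of the worst (1 - alpha) fraction of costs.
    import numpy as np

    def empirical_cvar(costs, alpha=0.95):
        """Mean of costs at or above the alpha-quantile (Value-at-Risk)."""
        costs = np.asarray(costs, dtype=float)
        var = np.quantile(costs, alpha)    # Value-at-Risk threshold
        return costs[costs >= var].mean()  # average of the worst-case tail

    rng = np.random.default_rng(0)
    samples = rng.lognormal(sigma=1.0, size=10_000)  # stand-in cost samples
    print(empirical_cvar(samples, alpha=0.95))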
Off-policy evaluation (OPE) is one of the most fundamental problems in reinforcement learning (RL): estimating the expected long-term payoff of a given target policy using only experiences from another behavior policy that is potentially unknown. …
External link:
http://arxiv.org/abs/2410.17538
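As background for the OPE problem described above, here is a minimal sketch of the classical (ordinary) importance-sampling estimator. It assumes the behavior policy's action probabilities are known (the paper addresses the harder case where they may not be); pi_target and pi_behavior are hypothetical callables returning action probabilities:

    # Ordinary importance-sampling OPE: reweight each trajectory's return
    # by the ratio of target to behavior action probabilities.
    import numpy as np

    def is_ope(trajectories, pi_target, pi_behavior, gamma=0.99):
        """trajectories: lists of (state, action, reward) tuples."""
        estimates = []
        for traj in trajectories:
            weight, ret = 1.0, 0.0
            for t, (s, a, r) in enumerate(traj):
                weight *= pi_target(a, s) / pi_behavior(a, s)
                ret += (gamma ** t) * r
            estimates.append(weight * ret)
        return float(np.mean(estimates))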
Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. In this work, utilizing the …
External link:
http://arxiv.org/abs/2410.17221
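The exponential blow-up mentioned in this entry is easy to make concrete: with n agents, each having |S_i| local states and |A_i| local actions, the joint state-action space has (|S_i| * |A_i|)^n elements. A quick check in Python (the local sizes 3 and 2 are arbitrary):

    # Joint state-action count for n agents with 3 local states and
    # 2 local actions each: (3 * 2) ** n, exponential in n.
    for n in (2, 5, 10, 20):
        print(n, (3 * 2) ** n)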
In Bayesian optimization, a black-box function is maximized with the help of a surrogate model. We apply distributed Thompson sampling, using a Gaussian process as a surrogate model, to approach the multi-agent Bayesian optimization problem. In our distributed …
External link:
http://arxiv.org/abs/2410.15543
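To illustrate the Thompson-sampling idea with a GP surrogate, a minimal single-agent step using scikit-learn (the paper's distributed, multi-agent variant is more involved; the test function, kernel, and search grid here are arbitrary stand-ins):

    # One Thompson-sampling step: fit a GP to observations, draw one
    # function from the posterior, and query its maximizer next.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    f = lambda x: -np.sin(3 * x) - x**2 + 0.7 * x    # hidden black box

    X = rng.uniform(-1.0, 2.0, size=(5, 1))          # initial observations
    y = f(X).ravel()

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(X, y)
    grid = np.linspace(-1.0, 2.0, 200).reshape(-1, 1)
    draw = gp.sample_y(grid, random_state=1).ravel() # one posterior draw
    print("next query point:", grid[np.argmax(draw)])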
Author:
Du, Jin, Zhang, Xinhe, Shen, Hao, Xian, Xun, Wang, Ganghua, Zhang, Jiawei, Yang, Yuhong, Li, Na, Liu, Jia, Ding, Jie
Lifelong learning in artificial intelligence (AI) aims to mimic the biological brain's ability to continuously learn and retain knowledge, yet it faces challenges such as catastrophic forgetting. Recent neuroscience research suggests that neural activity …
External link:
http://arxiv.org/abs/2409.13997
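Catastrophic forgetting, named in this entry, is commonly mitigated by rehearsing stored examples from earlier tasks; the snippet does not say whether this paper uses replay, so the following reservoir-sampled buffer is only a generic illustration:

    # Reservoir-sampling replay buffer: keeps a uniform random sample of
    # a stream of past-task examples to mix into new-task training batches.
    import random

    class ReplayBuffer:
        def __init__(self, capacity=1000):
            self.capacity, self.data, self.seen = capacity, [], 0

        def add(self, example):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append(example)
            else:
                j = random.randrange(self.seen)  # keep with prob cap/seen
                if j < self.capacity:
                    self.data[j] = example

        def sample(self, k):
            return random.sample(self.data, min(k, len(self.data)))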
Author:
Talebi, Shahriar, Li, Na
In stochastic systems, risk-sensitive control balances performance with resilience to less likely events. Although existing methods rely on finite-horizon risk criteria, this paper introduces "limiting-risk criteria" that capture long-term cumulative …
External link:
http://arxiv.org/abs/2409.10767
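One plausible reading of the contrast drawn in this snippet, written in LaTeX with a generic stage cost c and risk measure \rho such as CVaR (the paper's exact definition of its limiting-risk criteria may differ):

    % Finite-horizon risk criterion vs. a limiting (long-run average)
    % risk criterion over cumulative cost; hedged sketch only.
    J_T(\pi) = \rho\!\left( \sum_{t=0}^{T-1} c(x_t, u_t) \right),
    \qquad
    J_\infty(\pi) = \limsup_{T \to \infty} \frac{1}{T}\,
                    \rho\!\left( \sum_{t=0}^{T-1} c(x_t, u_t) \right).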
Author:
Wang, Yifan, Stevens, David, Shah, Pranay, Jiang, Wenwen, Liu, Miao, Chen, Xu, Kuo, Robert, Li, Na, Gong, Boying, Lee, Daniel, Hu, Jiabo, Zhang, Ning, Kamma, Bob
The growing demand for AI training data has transformed data annotation into a global industry, but traditional approaches relying on human annotators are often time-consuming, labor-intensive, and prone to inconsistent quality. We propose the Model-…
External link:
http://arxiv.org/abs/2409.10702
Interactive preference learning systems present humans with queries as pairs of options; humans then select their preferred choice, allowing the system to infer preferences from these binary choices. While binary choice feedback is simple and widely …
External link:
http://arxiv.org/abs/2409.05798
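Binary choices over option pairs, as described above, are classically modeled with Bradley-Terry: each option i carries a latent utility u_i and P(i is chosen over j) = 1 / (1 + exp(u_j - u_i)). A minimal gradient-ascent fit (a standard baseline, not necessarily the paper's method):

    # Bradley-Terry: infer latent utilities from (winner, loser) pairs by
    # maximizing the log-likelihood with plain gradient ascent.
    import numpy as np

    def fit_bradley_terry(pairs, n_options, lr=0.1, steps=500):
        u = np.zeros(n_options)
        for _ in range(steps):
            grad = np.zeros(n_options)
            for w, l in pairs:
                p = 1.0 / (1.0 + np.exp(u[l] - u[w]))  # P(w beats l)
                grad[w] += 1.0 - p
                grad[l] -= 1.0 - p
            u += lr * grad
        return u - u.mean()          # utilities identified only up to a shift

    print(fit_bradley_terry([(0, 1), (0, 2), (1, 2)], n_options=3))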
Author:
Wu, Chaoyi, Qiu, Pengcheng, Liu, Jinxin, Gu, Hongfei, Li, Na, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
In this study, we present MedS-Bench, a comprehensive benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts. Unlike existing benchmarks that focus on multiple-choice question answering, MedS-Bench spans 1…
External link:
http://arxiv.org/abs/2408.12547
This paper is concerned with a class of linear-quadratic stochastic large-population problems with partial information, where each individual agent only has access to a noisy observation process related to its state. The dynamics of each agent follows …
External link:
http://arxiv.org/abs/2408.09652
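For orientation, the generic linear-quadratic, partial-information setup behind such problems, in LaTeX; the paper's exact coefficients, mean-field coupling, and observation structure are not recoverable from this snippet:

    % Hedged sketch: linear state dynamics, a noisy observation process,
    % and a quadratic cost for agent i.
    dx_i(t) = \bigl( A x_i(t) + B u_i(t) \bigr)\,dt + \sigma\,dW_i(t),
    \qquad
    dy_i(t) = C x_i(t)\,dt + D\,dV_i(t),
    \qquad
    J_i(u_i) = \mathbb{E} \int_0^T \bigl( x_i^\top Q x_i
               + u_i^\top R u_i \bigr)\,dt.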