Showing 1 - 10 of 475 for search: '"Verma, Arun"'
This paper considers a novel online fair division problem involving multiple agents in which a learner observes an indivisible item that has to be irrevocably allocated to one of the agents while satisfying a fairness and efficiency constraint. Exist…
External link:
http://arxiv.org/abs/2408.12845
Contextual dueling bandits are used to model bandit problems where a learner's goal is to find the best arm for a given context using observed noisy preference feedback over the arms selected for past contexts. However, existing algorithms ass…
External link:
http://arxiv.org/abs/2407.17112
Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established. Therefore, understanding how LLMs reason and make decisions is crucial for their safe deplo…
External link:
http://arxiv.org/abs/2407.14845
Author:
Xu, Xinyi, Wu, Zhaoxuan, Qiao, Rui, Verma, Arun, Shu, Yao, Wang, Jingtan, Niu, Xinyuan, He, Zhenfeng, Chen, Jiangwei, Zhou, Zijian, Lau, Gregory Kang Ruey, Dao, Hieu, Agussurja, Lucas, Sim, Rachael Hwee Ling, Lin, Xiaoqiang, Hu, Wenyang, Dai, Zhongxiang, Koh, Pang Wei, Low, Bryan Kian Hsiang
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential…
External link:
http://arxiv.org/abs/2406.14473
Author:
Lin, Xiaoqiang, Dai, Zhongxiang, Verma, Arun, Ng, See-Kiong, Jaillet, Patrick, Low, Bryan Kian Hsiang
Large language models (LLMs) have demonstrated remarkable performance in various tasks. However, the performance of LLMs heavily depends on the input prompt, which has given rise to a number of recent works on prompt optimization. However, previous…
External link:
http://arxiv.org/abs/2405.17346
We study a novel variant of the parameterized bandits problem in which the learner can observe additional auxiliary feedback that is correlated with the observed reward. The auxiliary feedback is readily available in many real-life applications, e.g.…
External link:
http://arxiv.org/abs/2311.02715
Author:
Singh, Tarun Pal, Verma, Arun Kumar, Rajkumar, Vincentraju, Kumar, Ravindra, Singh, Manoj Kumar, Chatli, Manish Kumar
Published in:
British Food Journal, 2024, Vol. 126, Issue 9, pp. 3423-3440.
External link:
http://www.emeraldinsight.com/doi/10.1108/BFJ-01-2024-0001
Author:
Dai, Zhongxiang, Lau, Gregory Kang Ruey, Verma, Arun, Shu, Yao, Low, Bryan Kian Hsiang, Jaillet, Patrick
Kernelized bandits, also known as Bayesian optimization (BO), is a prevalent method for optimizing complicated black-box reward functions. Various BO algorithms have been theoretically shown to enjoy upper bounds on their cumulative regret whic…
External link:
http://arxiv.org/abs/2310.05373