Showing 1 - 10 of 524 for search: '"Precup, Doina"'
Author:
Lanctot, Marc, Larson, Kate, Kaisers, Michael, Berthet, Quentin, Gemp, Ian, Diaz, Manfred, Maura-Rivero, Roberto-Rafael, Bachrach, Yoram, Koop, Anna, Precup, Doina
A common way to drive progress of AI models and agents is to compare their performance on standardized benchmarks. Comparing the performance of general agents requires aggregating their individual performances across a potentially wide variety of dif
External link:
http://arxiv.org/abs/2411.00119
In Deep Reinforcement Learning (RL), it is a challenge to learn representations that do not exhibit catastrophic forgetting or interference in non-stationary environments. Successor Features (SFs) offer a potential solution to this challenge. However
External link:
http://arxiv.org/abs/2410.22133
Target-directed agents utilize self-generated targets to guide their behaviors for better generalization. These agents are prone to blindly chasing problematic targets, resulting in worse generalization and safety catastrophes. We show that these be
External link:
http://arxiv.org/abs/2410.07096
Research and industry are rapidly advancing the innovation and adoption of foundation model-based systems, yet the tools for managing these models have not kept pace. Understanding the provenance and lineage of models is critical for researchers, ind
External link:
http://arxiv.org/abs/2410.02230
Author:
Hua, Chenqing, Liu, Yong, Zhang, Dinghuai, Zhang, Odin, Luan, Sitao, Yang, Kevin K., Wolf, Guy, Precup, Doina, Zheng, Shuangjia
Enzyme design is a critical area in biotechnology, with applications ranging from drug development to synthetic biology. Traditional methods for enzyme function prediction or protein binding pocket design often fall short in capturing the dynamic and
External link:
http://arxiv.org/abs/2410.00327
Author:
Kumar, Aviral, Zhuang, Vincent, Agarwal, Rishabh, Su, Yi, Co-Reyes, John D, Singh, Avi, Baumli, Kate, Iqbal, Shariq, Bishop, Colton, Roelofs, Rebecca, Zhang, Lei M, McKinney, Kay, Shrivastava, Disha, Paduraru, Cosmin, Tucker, George, Precup, Doina, Behbahani, Feryal, Faust, Aleksandra
Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training self-correction typically depend on either multiple models, a
External link:
http://arxiv.org/abs/2409.12917
Author:
Hua, Chenqing, Zhong, Bozitao, Luan, Sitao, Hong, Liang, Wolf, Guy, Precup, Doina, Zheng, Shuangjia
Published in:
38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks
Enzymes, with their specific catalyzed reactions, are necessary for all aspects of life, enabling diverse biological processes and adaptations. Predicting enzyme functions is essential for understanding biological pathways, guiding drug development,
External link:
http://arxiv.org/abs/2408.13659
Author:
Chelu, Veronica, Precup, Doina
We apply functional acceleration to the Policy Mirror Descent (PMD) general family of algorithms, which cover a wide range of novel and fundamental methods in Reinforcement Learning (RL). Leveraging duality, we propose a momentum-based PMD update. By
External link:
http://arxiv.org/abs/2407.16602
Author:
Luan, Sitao, Hua, Chenqing, Lu, Qincheng, Ma, Liheng, Wu, Lirong, Wang, Xinyu, Xu, Minkai, Chang, Xiao-Wen, Precup, Doina, Ying, Rex, Li, Stan Z., Tang, Jian, Wolf, Guy, Jegelka, Stefanie
Homophily principle, i.e., nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to be the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) o
External link:
http://arxiv.org/abs/2407.09618
Author:
Ishfaq, Haque, Tan, Yixin, Yang, Yu, Lan, Qingfeng, Lu, Jianfeng, Mahmood, A. Rupam, Precup, Doina, Xu, Pan
Thompson sampling (TS) is one of the most popular exploration techniques in reinforcement learning (RL). However, most TS algorithms with theoretical guarantees are difficult to implement and not generalizable to Deep RL. While the emerging approxima
External link:
http://arxiv.org/abs/2406.12241