Showing 1 - 10 of 1,461 results for search: '"Hu, Yifan"'
Policy gradient methods are widely used in reinforcement learning. Yet, the nonconvexity of policy optimization imposes significant challenges in understanding the global convergence of policy gradient methods. For a class of finite-horizon Markov De…
External link:
http://arxiv.org/abs/2409.17138
Stock price prediction is a challenging problem in the field of finance and receives widespread attention. In recent years, with the rapid development of technologies such as deep learning and graph neural networks, more research methods have begun t…
External link:
http://arxiv.org/abs/2409.08282
We consider stochastic optimization when one only has access to biased stochastic oracles of the objective and the gradient, and obtaining stochastic gradients with low biases comes at high costs. This setting captures various optimization paradigms, …
External link:
http://arxiv.org/abs/2408.11084
Conversational Speech Synthesis (CSS) aims to express a target utterance with the proper speaking style in a user-agent conversation setting. Existing CSS methods employ effective multi-modal context modeling techniques to achieve empathy understandi…
External link:
http://arxiv.org/abs/2407.21491
Long-term stability stands as a crucial requirement in data-driven medium-range global weather forecasting. Spectral bias is recognized as the primary contributor to instabilities, as data-driven methods struggle to learn small-scale dynamics. In th…
External link:
http://arxiv.org/abs/2407.01598
The location of knowledge within Generative Pre-trained Transformer (GPT)-like models has seen extensive recent investigation. However, much of the work is focused on determining the locations of individual facts, with the end goal being the editing…
External link:
http://arxiv.org/abs/2406.15940
Transformer-based and MLP-based methods have emerged as leading approaches in time series forecasting (TSF). While Transformer-based methods excel in capturing long-range dependencies, they suffer from high computational complexities and tend to over…
External link:
http://arxiv.org/abs/2406.03751
In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (…
External link:
http://arxiv.org/abs/2406.01575
Author:
Ramesh, Shyam Sundhar, Hu, Yifan, Chaimalas, Iason, Mehta, Viraj, Sessa, Pier Giuseppe, Ammar, Haitham Bou, Bogunovic, Ilija
Adapting large language models (LLMs) for specific tasks usually involves fine-tuning through reinforcement learning with human feedback (RLHF) on preference data. While these data often come from diverse labelers' groups (e.g., different demographic…
External link:
http://arxiv.org/abs/2405.20304
We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional stochastic optimization problem. In the context of least-squares instrumental variable regression, our algorithms neither require matrix in…
External link:
http://arxiv.org/abs/2405.19463