Showing 1 - 10 of 1,219 results for search: '"Chang, P. D."'
Author:
Gao, Zhaolin, Zhan, Wenhao, Chang, Jonathan D., Swamy, Gokul, Brantley, Kianté, Lee, Jason D., Sun, Wen
Large Language Models (LLMs) have achieved remarkable success at tasks like summarization that involve a single turn of interaction. However, they can still struggle with multi-turn tasks like dialogue that require long-term planning. Previous works…
External link:
http://arxiv.org/abs/2410.04612
Author:
Yeung, Ryan, Black, David, Chen, Patrick B., Lessoway, Victoria, Reid, Janice, Rangel-Suarez, Sergio, Chang, Silvia D., Salcudean, Septimiu E.
Ultrasound is a hand-held, low-cost, non-invasive medical imaging modality which plays a vital role in diagnosing various diseases. Despite this, many rural and remote communities do not have access to ultrasound scans due to the lack of local expert…
External link:
http://arxiv.org/abs/2409.13058
Author:
Chang, Ray D., Shumiya, Nana, McLellan, Russell A., Zhang, Yifan, Bland, Matthew P., Bahrami, Faranak, Mun, Junsik, Zhou, Chenyu, Kisslinger, Kim, Cheng, Guangming, Pakpour-Tabrizi, Alexander C., Yao, Nan, Zhu, Yimei, Liu, Mingzhao, Cava, Robert J., Gopalakrishnan, Sarang, Houck, Andrew A., de Leon, Nathalie P.
The lifetime of superconducting qubits is limited by dielectric loss, and a major source of dielectric loss is the native oxide present at the surface of the superconducting metal. Specifically, tantalum-based superconducting qubits have been demonst…
External link:
http://arxiv.org/abs/2408.13051
Traditionally, reward models used for reinforcement learning from human feedback (RLHF) are trained to directly predict preference scores without leveraging the generation capabilities of the underlying large language model (LLM). This limits the cap…
External link:
http://arxiv.org/abs/2408.11791
Author:
Rudie, Jeffrey D., Lin, Hui-Ming, Ball, Robyn L., Jalal, Sabeena, Prevedello, Luciano M., Nicolaou, Savvas, Marinelli, Brett S., Flanders, Adam E., Magudia, Kirti, Shih, George, Davis, Melissa A., Mongan, John, Chang, Peter D., Berger, Ferco H., Hermans, Sebastiaan, Law, Meng, Richards, Tyler, Grunz, Jan-Peter, Kunz, Andreas Steven, Mathur, Shobhit, Galea-Soler, Sandro, Chung, Andrew D., Afat, Saif, Kuo, Chin-Chi, Aweidah, Layal, Campos, Ana Villanueva, Somasundaram, Arjuna, Tijmes, Felipe Antonio Sanchez, Jantarangkoon, Attaporn, Bittencourt, Leonardo Kayat, Brassil, Michael, Hajjami, Ayoub El, Dogan, Hakan, Becircic, Muris, Bharatkumar, Agrahara G., Farina, Eduardo Moreno Júdice de Mattos, Group, Dataset Curator, Group, Dataset Contributor, Group, Dataset Annotator, Colak, Errol
The RSNA Abdominal Traumatic Injury CT (RATIC) dataset is the largest publicly available collection of adult abdominal CT studies annotated for traumatic injuries. This dataset includes 4,274 studies from 23 institutions across 14 countries. The data…
External link:
http://arxiv.org/abs/2405.19595
Author:
Gao, Zhaolin, Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Swamy, Gokul, Brantley, Kianté, Joachims, Thorsten, Bagnell, J. Andrew, Lee, Jason D., Sun, Wen
While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO…
External link:
http://arxiv.org/abs/2404.16767
Adversarial imitation learning (AIL) has stood out as a dominant framework across various imitation learning (IL) applications, with Discriminator Actor Critic (DAC) (Kostrikov et al., 2019) demonstrating the effectiveness of off-policy learning alg…
External link:
http://arxiv.org/abs/2404.08513
Author:
Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Brantley, Kianté, Misra, Dipendra, Lee, Jason D., Sun, Wen
Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude 3 Opus. This framework often consists of two steps: learning a rewa…
External link:
http://arxiv.org/abs/2404.08495
Reinforcement learning (RL) has improved guided image generation with diffusion models by directly optimizing rewards that capture image quality, aesthetics, and instruction-following capabilities. However, the resulting generative policies inherit t…
External link:
http://arxiv.org/abs/2404.03673
Author:
Chang, Peter D.
This paper introduces the DeepATLAS foundational model for localization tasks in the domain of high-dimensional biomedical data. Upon convergence of the proposed self-supervised objective, a pretrained model maps an input to an anatomically-consisten…
External link:
http://arxiv.org/abs/2402.09587