Search results

Report

Calibrating Language Models with Adaptive Temperature Scaling

Authors: Xie, Johnathan, Chen, Annie S., Lee, Yoonho, Mitchell, Eric, Finn, Chelsea

The effectiveness of large language models (LLMs) is not only measured by their ability to generate accurate outputs but also by their calibration, that is, how well their confidence scores reflect the probability of their outputs being correct. While unsuperv

External link: http://arxiv.org/abs/2409.19817

Report

Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling

Authors: Liu, Yuejiang, Hamid, Jubayer Ibn, Xie, Annie, Lee, Yoonho, Du, Maximilian, Finn, Chelsea

Predicting and executing a sequence of actions without intermediate replanning, known as action chunking, is increasingly used in robot learning from human demonstrations. However, its effects on learned policies remain puzzling: some studies highlig

External link: http://arxiv.org/abs/2408.17355

Report

The infrastructure powering IBM's Gen AI model development

Authors: Gershon, Talia, Seelam, Seetharami, Belgodere, Brian, Bonilla, Milton, Hoang, Lan, Barnett, Danny, Chung, I-Hsin, Mohan, Apoorve, Chen, Ming-Hung, Luo, Lixiang, Walkup, Robert, Evangelinos, Constantinos, Salaria, Shweta, Dombrowa, Marc, Park, Yoonho, Kayi, Apo, Schour, Liran, Alim, Alim, Sydney, Ali, Maniotis, Pavlos, Schares, Laurent, Metzler, Bernard, Karacali-Akyamac, Bengi, Wen, Sophia, Chiba, Tatsuhiro, Choochotkaew, Sunyanan, Yoshimura, Takeshi, Misale, Claudia, Elengikal, Tonia, Connor, Kevin O, Liu, Zhuoran, Molina, Richard, Schneidenbach, Lars, Caden, James, Laibinis, Christopher, Fonseca, Carlos, Tarasov, Vasily, Sundararaman, Swaminathan, Schmuck, Frank, Guthridge, Scott, Cohn, Jeremy, Eshel, Marc, Muench, Paul, Liu, Runyu, Pointer, William, Wyskida, Drew, Krull, Bob, Rose, Ray, Wolfe, Brent, Cornejo, William, Walter, John, Malone, Colm, Perucci, Clifford, Franco, Frank, Hinds, Nigel, Calio, Bob, Druyan, Pavel, Kilduff, Robert, Kienle, John, McStay, Connor, Figueroa, Andrew, Connolly, Matthew, Fost, Edie, Roma, Gina, Fonseca, Jake, Levy, Ido, Payne, Michele, Schenkel, Ryan, Malki, Amir, Schneider, Lion, Narkhede, Aniruddha, Moshref, Shekeba, Kisin, Alexandra, Dodin, Olga, Rippon, Bill, Wrieth, Henry, Ganci, John, Colino, Johnny, Habeger-Rose, Donna, Pandey, Rakesh, Gidh, Aditya, Gaur, Aditya, Patterson, Dennis, Salmani, Samsuddin, Varma, Rambilas, Rumana, Rumana, Sharma, Shubham, Mishra, Mayank, Panda, Rameswar, Prasad, Aditya, Stallone, Matt, Zhang, Gaoyuan, Shen, Yikang, Cox, David, Puri, Ruchir, Agrawal, Dakshi, Thorstensen, Drew, Belog, Joel, Tang, Brent, Gupta, Saurabh Kumar, Biswas, Amitabha, Maheshwari, Anup, Gampel, Eran, Van Patten, Jason, Runion, Matthew, Kaki, Sai, Bogin, Yigal, Reitz, Brian, Pritko, Steve, Najam, Shahan, Nambala, Surya, Chirra, Radhika, Welp, Rick, DiMitri, Frank, Telles, Felipe, Arvelo, Amilcar, Chu, King, Seminaro, Ed, Schram, Andrew, Eickhoff, Felix, Hanson, William, Mckeever, Eric, Joseph, Dinakaran, Chaudhary, Piyush, Shivam, Piyush, Chaudhary, Puneet, Jones, Wesley, Guthrie, Robert, Bostic, Chris, Islam, Rezaul, Duersch, Steve, Sawdon, Wayne, Lewars, John, Klos, Matthew, Spriggs, Michael, McMillan, Bill, Gao, George, Kamra, Ashish, Singh, Gaurav, Curry, Marc, Katarki, Tushar, Talerico, Joe, Shi, Zenghui, Malleni, Sai Sindhur, Gallen, Erwan

AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational

External link: http://arxiv.org/abs/2407.05467

Report

Self-Explainable Temporal Graph Networks based on Graph Information Bottleneck

Authors: Seo, Sangwoo, Kim, Sungwon, Jung, Jihyeong, Lee, Yoonho, Park, Chanyoung

Temporal Graph Neural Networks (TGNN) have the ability to capture both the graph topology and dynamic dependencies of interactions within a graph over time. There has been a growing need to explain the predictions of TGNN models due to the difficulty

External link: http://arxiv.org/abs/2406.13214

Report

Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses

Authors: Kim, Juyeon, Lee, Jeongeun, Chang, Yoonho, Choi, Chanyeol, Kim, Junseong, Sohn, Jy-yong

Mitigating hallucination issues is a key challenge that must be overcome to reliably deploy large language models (LLMs) in real-world scenarios. Recently, various methods have been proposed to detect and revise factual errors in LLM-generated texts,

External link: http://arxiv.org/abs/2402.17097

Report

Self-Guided Masked Autoencoders for Domain-Agnostic Self-Supervised Learning

Authors: Xie, Johnathan, Lee, Yoonho, Chen, Annie S., Finn, Chelsea

Self-supervised learning excels in learning representations from large amounts of unlabeled data, demonstrating success across multiple data modalities. Yet, extending self-supervised learning to new modalities is non-trivial because the specifics of

External link: http://arxiv.org/abs/2402.14789

Report

Clarify: Improving Model Robustness With Natural Language Corrections

Authors: Lee, Yoonho, Lam, Michelle S., Vasconcelos, Helena, Bernstein, Michael S., Finn, Chelsea

The standard way to teach models is by feeding them lots of data. However, this approach often teaches models incorrect ideas because they pick up on misleading signals in the data. To prevent such misconceptions, we must necessarily provide addition

External link: http://arxiv.org/abs/2402.03715

Report

AutoFT: Learning an Objective for Robust Fine-Tuning

Authors: Choi, Caroline, Lee, Yoonho, Chen, Annie, Zhou, Allan, Raghunathan, Aditi, Finn, Chelsea

Foundation models encode rich representations that can be adapted to downstream tasks by fine-tuning. However, fine-tuning a model on one data distribution often degrades performance under distribution shifts. Current approaches to robust fine-tuning

External link: http://arxiv.org/abs/2401.10220

Report

SwiFT: Swin 4D fMRI Transformer

Authors: Kim, Peter Yongho, Kwon, Junbeom, Joo, Sunghwan, Bae, Sangyoon, Lee, Donggyu, Jung, Yoonho, Yoo, Shinjae, Cha, Jiook, Moon, Taesup

Modeling spatiotemporal brain dynamics from high-dimensional data, such as functional Magnetic Resonance Imaging (fMRI), is a formidable task in neuroscience. Existing approaches for fMRI analysis utilize hand-crafted features, but the process of fea

External link: http://arxiv.org/abs/2307.05916

Report

Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts

Authors: Chen, Annie S., Lee, Yoonho, Setlur, Amrith, Levine, Sergey, Finn, Chelsea

Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car), and shortcut features (e.g., an object on a road is more likely to be a car). The l

External link: http://arxiv.org/abs/2306.11120
