Search results

Report

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

Authors: Yen, Jui-Nan; Si, Si; Meng, Zhao; Yu, Felix; Duvvuri, Sai Surya; Dhillon, Inderjit S.; Hsieh, Cho-Jui; Kumar, Sanjiv

Low-rank adaptation (LoRA) is a widely used parameter-efficient finetuning method for LLMs that reduces memory requirements. However, current LoRA optimizers lack transformation invariance, meaning the actual updates to the weights depend on how the tw

External link: http://arxiv.org/abs/2410.20625

Report

Learning Efficient Representations of Neutrino Telescope Events

Authors: Yu, Felix J.; Kamp, Nicholas; Argüelles, Carlos A.

Neutrino telescopes detect rare interactions of particles produced in some of the most extreme environments in the Universe. This is accomplished by instrumenting a cubic-kilometer volume of naturally occurring transparent medium with light sensors.

External link: http://arxiv.org/abs/2410.13148

Report

Baby Bear: Seeking a Just Right Rating Scale for Scalar Annotations

Authors: Han, Xu; Yu, Felix; Sedoc, Joao; Van Durme, Benjamin

Our goal is a mechanism for efficiently assigning scalar ratings to each of a large set of elements. For example, "what percent positive or negative is this product review?" When sample sizes are small, prior work has advocated for methods such as Be

External link: http://arxiv.org/abs/2408.09765

Report

Enhancing Events in Neutrino Telescopes through Deep Learning-Driven Super-Resolution

Authors: Yu, Felix J.; Kamp, Nicholas; Argüelles, Carlos A.

Recent discoveries by neutrino telescopes, such as the IceCube Neutrino Observatory, relied extensively on machine learning (ML) tools to infer physical quantities from the raw photon hits detected. Neutrino telescope reconstruction algorithms are li

External link: http://arxiv.org/abs/2408.08474

Report

Efficient Document Ranking with Learnable Late Interactions

Authors: Ji, Ziwei; Jain, Himanshu; Veit, Andreas; Reddi, Sashank J.; Jayasumana, Sadeep; Rawat, Ankit Singh; Menon, Aditya Krishna; Yu, Felix; Kumar, Sanjiv

Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and d

External link: http://arxiv.org/abs/2406.17968

Report

Large Language Models are Interpretable Learners

Authors: Wang, Ruochen; Si, Si; Yu, Felix; Wiesmann, Dorothea; Hsieh, Cho-Jui; Dhillon, Inderjit

The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiveness, wher

External link: http://arxiv.org/abs/2406.17224

Report

Consistent Electroweak Phenomenology of a Nearly Degenerate $Z'$ Boson

Authors: Chiatto, Prisco Lo; Yu, Felix

Extracting constraints on kinetic mixing between a new $U(1)'$ gauge boson hiding under the Standard Model $Z$ boson resonance requires the formalism of non-Hermitian two-point correlation functions at 1-loop order. We derive self-consistent collider

External link: http://arxiv.org/abs/2405.03396

Report

Regression-aware Inference with LLMs

Authors: Lukasik, Michal; Narasimhan, Harikrishna; Menon, Aditya Krishna; Yu, Felix; Kumar, Sanjiv

Published in: EMNLP Findings 2024

Large language models (LLMs) have shown strong results on a range of applications, including regression and scoring tasks. Typically, one obtains outputs from an LLM via autoregressive sampling from the model's output distribution. We show that this

External link: http://arxiv.org/abs/2403.04182

Report

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Authors: Aksitov, Renat; Miryoosefi, Sobhan; Li, Zonglin; Li, Daliang; Babayan, Sheila; Kopparapu, Kavya; Fisher, Zachary; Guo, Ruiqi; Prakash, Sushant; Srinivasan, Pranesh; Zaheer, Manzil; Yu, Felix; Kumar, Sanjiv

Answering complex natural language questions often necessitates multi-step reasoning and integrating external information. Several systems have combined knowledge retrieval with a large language model (LLM) to answer such questions. These systems, ho

External link: http://arxiv.org/abs/2312.10003

Report

Automatic Engineering of Long Prompts

Authors: Hsieh, Cho-Jui; Si, Si; Yu, Felix X.; Dhillon, Inderjit S.

Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts. However, these prompts can be lengthy, often compris

External link: http://arxiv.org/abs/2311.10117
