Search results

Report

Generating event descriptions under syntactic and semantic constraints

Authors: Cao, Angela, Holt, Faye, Chan, Jonas, Richter, Stephanie, Glass, Lelia, White, Aaron Steven

With the goal of supporting scalable lexical semantic annotation, analysis, and theorizing, we conduct a comprehensive evaluation of different methods for generating event descriptions under both syntactic constraints -- e.g. desired clause structure

External link: http://arxiv.org/abs/2412.18496

Report

Safe Active Learning for Gaussian Differential Equations

Authors: Glass, Leon, Ensinger, Katharina, Zimmer, Christoph

Gaussian Process differential equations (GPODE) have recently gained momentum due to their ability to capture dynamics behavior of systems and also represent uncertainty in predictions. Prior work has described the process of training the hyperparame

External link: http://arxiv.org/abs/2412.09053

Report

State-Space Large Audio Language Models

Authors: Bhati, Saurabhchand, Gong, Yuan, Karlinsky, Leonid, Kuehne, Hilde, Feris, Rogerio, Glass, James

Large Audio Language Models (LALM) combine the audio perception models and the Large Language Models (LLM) and show a remarkable ability to reason about the input audio, infer the meaning, and understand the intent. However, these systems rely on Tra

External link: http://arxiv.org/abs/2411.15685

Report

Teaching VLMs to Localize Specific Objects from In-context Examples

Authors: Doveh, Sivan, Shabtay, Nimrod, Lin, Wei, Schwartz, Eli, Kuehne, Hilde, Giryes, Raja, Feris, Rogerio, Karlinsky, Leonid, Glass, James, Arbelle, Assaf, Ullman, Shimon, Mirza, M. Jehanzeb

Vision-Language Models (VLMs) have shown remarkable capabilities across diverse visual tasks, including image recognition, video understanding, and Visual Question Answering (VQA) when explicitly trained for these tasks. Despite these advances, we fi

External link: http://arxiv.org/abs/2411.13317

Report

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Authors: Chang, Heng-Jui, Gong, Hongyu, Wang, Changhan, Glass, James, Chung, Yu-An

Spoken language models (SLMs) have gained increasing attention with advancements in text-based, decoder-only language models. SLMs process text and speech, enabling simultaneous speech understanding and generation. This paper presents Double-Codebook

External link: http://arxiv.org/abs/2410.24177

Report

A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation

Authors: Liu, Alexander H., Wang, Qirui, Gong, Yuan, Glass, James

Neural Audio Codecs, initially designed as a compression technique, have gained more attention recently for speech generation. Codec models represent each audio frame as a sequence of tokens, i.e., discrete embeddings. The discrete and low-frequency

External link: http://arxiv.org/abs/2410.22448

Report

Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback

Authors: Jedidi, Nour, Chuang, Yung-Sung, Shing, Leslie, Glass, James

Building effective dense retrieval systems remains difficult when relevance supervision is not available. Recent work has looked to overcome this challenge by using a Large Language Model (LLM) to generate hypothetical documents that can be used to f

External link: http://arxiv.org/abs/2410.21242

Report

The Galaxy Zoo Catalogs for the Galaxy And Mass Assembly (GAMA) Survey

Authors: Holwerda, Benne W., Robertson, Clayton, Cook, Kyle, Pimbblet, Kevin A., Casura, Sarah, Sansom, Anne E., Patel, Divya, Butrum, Trevor, Glass, David H. W., Kelvin, Lee, Baldry, Ivan K., De Propris, Roberto, Bamford, Steven, Masters, Karen, Stone, Maria, Hardin, Tim, Walmsley, Mike, Liske, Jochen, Adnan, S M Rafee

Galaxy Zoo is an online project to classify morphological features in extra-galactic imaging surveys with public voting. In this paper, we compare the classifications made for two different surveys, the Dark Energy Spectroscopic Instrument (DESI) ima

External link: http://arxiv.org/abs/2410.19985

Report

Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains

Authors: Li, Kun, Zhang, Tianhua, Wu, Xixin, Luo, Hongyin, Glass, James, Meng, Helen

Knowledge Graphs (KGs) can serve as reliable knowledge sources for question answering (QA) due to their structured representation of knowledge. Existing research on the utilization of KG for large language models (LLMs) prevalently relies on subgraph

External link: http://arxiv.org/abs/2410.18415

Report

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

Authors: Mirza, M. Jehanzeb, Zhao, Mengjie, Mao, Zhuoyuan, Doveh, Sivan, Lin, Wei, Gavrikov, Paul, Dorkenwald, Michael, Yang, Shiqi, Jha, Saurav, Wakaki, Hiromi, Mitsufuji, Yuki, Possegger, Horst, Feris, Rogerio, Karlinsky, Leonid, Glass, James

In this work, we propose a novel method (GLOV) enabling Large Language Models (LLMs) to act as implicit Optimizers for Vision-Language Models (VLMs) to enhance downstream vision tasks. Our GLOV meta-prompts an LLM with the downstream task descriptio

External link: http://arxiv.org/abs/2410.06154
