Showing 1 - 10 of 86,203 for search: '"A, GLASS"'
With the goal of supporting scalable lexical semantic annotation, analysis, and theorizing, we conduct a comprehensive evaluation of different methods for generating event descriptions under both syntactic constraints -- e.g. desired clause structure
External link:
http://arxiv.org/abs/2412.18496
Gaussian Process differential equations (GPODE) have recently gained momentum due to their ability to capture the dynamic behavior of systems and also represent uncertainty in predictions. Prior work has described the process of training the hyperparame
External link:
http://arxiv.org/abs/2412.09053
Author:
Bhati, Saurabhchand, Gong, Yuan, Karlinsky, Leonid, Kuehne, Hilde, Feris, Rogerio, Glass, James
Large Audio Language Models (LALMs) combine audio perception models with Large Language Models (LLMs) and show a remarkable ability to reason about the input audio, infer the meaning, and understand the intent. However, these systems rely on Tra
External link:
http://arxiv.org/abs/2411.15685
Author:
Doveh, Sivan, Shabtay, Nimrod, Lin, Wei, Schwartz, Eli, Kuehne, Hilde, Giryes, Raja, Feris, Rogerio, Karlinsky, Leonid, Glass, James, Arbelle, Assaf, Ullman, Shimon, Mirza, M. Jehanzeb
Vision-Language Models (VLMs) have shown remarkable capabilities across diverse visual tasks, including image recognition, video understanding, and Visual Question Answering (VQA) when explicitly trained for these tasks. Despite these advances, we fi
External link:
http://arxiv.org/abs/2411.13317
Spoken language models (SLMs) have gained increasing attention with advancements in text-based, decoder-only language models. SLMs process text and speech, enabling simultaneous speech understanding and generation. This paper presents Double-Codebook
External link:
http://arxiv.org/abs/2410.24177
Neural Audio Codecs, initially designed as a compression technique, have gained more attention recently for speech generation. Codec models represent each audio frame as a sequence of tokens, i.e., discrete embeddings. The discrete and low-frequency
External link:
http://arxiv.org/abs/2410.22448
Building effective dense retrieval systems remains difficult when relevance supervision is not available. Recent work has looked to overcome this challenge by using a Large Language Model (LLM) to generate hypothetical documents that can be used to f
External link:
http://arxiv.org/abs/2410.21242
Author:
Holwerda, Benne W., Robertson, Clayton, Cook, Kyle, Pimbblet, Kevin A., Casura, Sarah, Sansom, Anne E., Patel, Divya, Butrum, Trevor, Glass, David H. W., Kelvin, Lee, Baldry, Ivan K., De Propris, Roberto, Bamford, Steven, Masters, Karen, Stone, Maria, Hardin, Tim, Walmsley, Mike, Liske, Jochen, Adnan, S M Rafee
Galaxy Zoo is an online project to classify morphological features in extra-galactic imaging surveys with public voting. In this paper, we compare the classifications made for two different surveys, the Dark Energy Spectroscopic Instrument (DESI) ima
External link:
http://arxiv.org/abs/2410.19985
Knowledge Graphs (KGs) can serve as reliable knowledge sources for question answering (QA) due to their structured representation of knowledge. Existing research on the utilization of KG for large language models (LLMs) prevalently relies on subgraph
External link:
http://arxiv.org/abs/2410.18415
Author:
Mirza, M. Jehanzeb, Zhao, Mengjie, Mao, Zhuoyuan, Doveh, Sivan, Lin, Wei, Gavrikov, Paul, Dorkenwald, Michael, Yang, Shiqi, Jha, Saurav, Wakaki, Hiromi, Mitsufuji, Yuki, Possegger, Horst, Feris, Rogerio, Karlinsky, Leonid, Glass, James
In this work, we propose a novel method (GLOV) enabling Large Language Models (LLMs) to act as implicit Optimizers for Vision-Language Models (VLMs) to enhance downstream vision tasks. Our GLOV meta-prompts an LLM with the downstream task descriptio
External link:
http://arxiv.org/abs/2410.06154