Showing 1 - 10 of 35,465 results for search: '"A Suri"'
Recent advancements in vision-language models (VLMs) offer potential for robot task planning, but challenges remain due to VLMs' tendency to generate incorrect action sequences. To address these limitations, we propose VeriGraph, a novel framework th
External link:
http://arxiv.org/abs/2411.10446
We present LARP, a novel video tokenizer designed to overcome limitations in current video tokenization methods for autoregressive (AR) generative models. Unlike traditional patchwise tokenizers that directly encode local visual patches into discrete
External link:
http://arxiv.org/abs/2410.21264
Authors:
Parmar, Vivek, Bane, Dwijay, Sarwar, Syed Shakib, Stangherlin, Kleber, De Salvo, Barbara, Suri, Manan
With the emergence of the Metaverse and the focus on wearable devices in recent years, gesture-based human-computer interaction has gained significance. To enable gesture recognition for VR/AR headsets and glasses, several datasets focusing on egocentr
External link:
http://arxiv.org/abs/2410.19486
Authors:
Hossain, Jumman, Dey, Emon, Chugh, Snehalraj, Ahmed, Masud, Anwar, MS, Faridee, Abu-Zaher, Hoppes, Jason, Trout, Theron, Basak, Anjon, Chowdhury, Rafidh, Mistry, Rishabh, Kim, Hyun, Freeman, Jade, Suri, Niranjan, Raglin, Adrienne, Busart, Carl, Gregory, Timothy, Ravi, Anuradha, Roy, Nirmalya
The increasing deployment of autonomous systems in complex environments necessitates efficient communication and task completion among multiple agents. This paper presents SERN (Simulation-Enhanced Realistic Navigation), a novel framework integrating
External link:
http://arxiv.org/abs/2410.16686
Authors:
Suri, Manan, Mathur, Puneet, Dernoncourt, Franck, Jain, Rajiv, Morariu, Vlad I, Sawhney, Ramit, Nakov, Preslav, Manocha, Dinesh
Document structure editing involves manipulating localized textual, visual, and layout components in document images based on the user's requests. Past works have shown that multimodal grounding of user requests in the document image and identifying
External link:
http://arxiv.org/abs/2410.16472
This technical report investigates the application of event-based vision sensors in non-invasive qualitative vibration analysis, with a particular focus on frequency measurement and motion magnification. Event cameras, with their high temporal resolu
External link:
http://arxiv.org/abs/2410.14364
Authors:
Glockner, Helge, Suri, Ali
If G is a Lie group modeled on a Fréchet space, let e be its neutral element and g be its Lie algebra. We show that every strong ILB-Lie group G is L^1-regular in the sense that each f in L^1([0,1],g) is the right logarithmic derivative of some abs
External link:
http://arxiv.org/abs/2410.02909
As Chile's electric power sector advances toward a future powered by renewable energy, accurate forecasting of renewable generation is essential for managing grid operations. The integration of renewable energy sources is particularly challenging due
External link:
http://arxiv.org/abs/2409.09263
Authors:
Fradkin, Philip, Azadi, Puria, Suri, Karush, Wenkel, Frederik, Bashashati, Ali, Sypetkowski, Maciej, Beaini, Dominique
Predicting molecular impact on cellular function is a core challenge in therapeutic design. Phenomic experiments, designed to capture cellular morphology, utilize microscopy-based techniques and demonstrate a high-throughput solution for uncovering m
External link:
http://arxiv.org/abs/2409.08302
In this article, we propose a new method of generating single microwave photons in superconducting circuits. We theoretically show that pure single microwave photons can be generated on demand and tuned over a large frequency band by making use of La
External link:
http://arxiv.org/abs/2409.05117