Showing 1 - 10 of 23 392
for search: '"P. A. ro"'
Recent works on Generalized Referring Expression Segmentation (GRES) struggle with handling complex expressions referring to multiple distinct objects. This is because these methods typically employ an end-to-end foreground-background segmentation an…
External link:
http://arxiv.org/abs/2411.15087
Author:
Lee, Chang-Gi, Chae, Byeong-Gyu, Ro, I-Jun, Jang, Kyuseon, Woods, Eric, Ahn, Jaemin, Park, Seong Yong, Gault, Baptiste, Kim, Se-Ho
Atom probe tomography (APT) enables near atomic scale three dimensional elemental mapping through the controlled field evaporation of surface atoms triggered by the combined application of a DC voltage and either voltage or laser pulses. As the selec…
External link:
http://arxiv.org/abs/2411.10506
Author:
Cai, Ruisi, Ro, Yeonju, Kim, Geon-Woo, Wang, Peihao, Bejnordi, Babak Ehteshami, Akella, Aditya, Wang, Zhangyang
The proliferation of large language models (LLMs) has led to the adoption of Mixture-of-Experts (MoE) architectures that dynamically leverage specialized subnetworks for improved efficiency and performance. Despite their benefits, MoE models face sig…
External link:
http://arxiv.org/abs/2410.19123
Author:
De Ro, Joeri
Given a locally compact quantum group $\mathbb{G}$ and two $\mathbb{G}$-$W^*$-algebras $\alpha: A\curvearrowleft \mathbb{G}$ and $\beta: B\curvearrowleft \mathbb{G}$, we study the notion of equivariant $W^*$-Morita equivalence $(A, \alpha)\sim_{\mathbb{G}} (B, \beta)$…
External link:
http://arxiv.org/abs/2410.17407
In-context learning (ICL) is a powerful paradigm where large language models (LLMs) benefit from task demonstrations added to the prompt. Yet, selecting optimal demonstrations is not trivial, especially for complex or multi-modal tasks where input an…
External link:
http://arxiv.org/abs/2410.14049
We further explore the notion of Ulam words considered by Bade, Cui, Labelle, and Li. We find that when interpreted as integers in a natural way, Ulam words appear to follow a new, unexplained distribution. Gaps between words and words of special typ…
External link:
http://arxiv.org/abs/2410.01217
The success of visual instruction tuning has accelerated the development of large language and vision models (LLVMs). Following the scaling laws of instruction-tuned large language models (LLMs), LLVMs either have further increased their sizes, reach…
External link:
http://arxiv.org/abs/2409.14713
Author:
Yeo, Jeong Hun, Kim, Chae Won, Kim, Hyunjun, Rha, Hyeongseop, Han, Seunghee, Cheng, Wen-Huang, Ro, Yong Man
Lip reading aims to predict spoken language by analyzing lip movements. Despite advancements in lip reading technologies, performance degrades when models are applied to unseen speakers due to their sensitivity to variations in visual information suc…
External link:
http://arxiv.org/abs/2409.00986
SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models
Large-scale Vision-Language Models (LVLMs) have significantly advanced with text-aligned vision inputs. They have made remarkable progress in computer vision tasks by aligning text modality with vision inputs. There are also endeavors to incorporate…
External link:
http://arxiv.org/abs/2408.12114
For Image Super-Resolution (SR), it is common to train and evaluate scale-specific models composed of an encoder and upsampler for each targeted scale. Consequently, many SR studies encounter substantial training times and complex deployment requirem…
External link:
http://arxiv.org/abs/2408.09674