Showing 1-10 of 2,159 for search: '"I.2.10"'
We propose a Ground IoU (Gr-IoU) to address the data association problem in multi-object tracking. When tracking objects detected by a camera, it often occurs that the same object is assigned different IDs in consecutive frames, especially when objec
External link:
http://arxiv.org/abs/2409.03252
Recently, many self-supervised learning methods for image reconstruction have been proposed that can learn from noisy data alone, bypassing the need for ground-truth references. Most existing methods cluster around two classes: i) Noise2Self and simi
External link:
http://arxiv.org/abs/2409.01985
Traversability assessment of deformable terrain is vital for safe rover navigation on planetary surfaces. Machine learning (ML) is a powerful tool for traversability prediction but faces predictive uncertainty. This uncertainty leads to prediction er
External link:
http://arxiv.org/abs/2409.00641
Scaling laws dictate that the performance of AI models is proportional to the amount of available data. Data augmentation is a promising solution to expanding the dataset size. Traditional approaches focused on augmentation using rotation, translatio
External link:
http://arxiv.org/abs/2409.00547
In recent years, several models using Quaternion-Valued Convolutional Neural Networks (QCNNs) for different problems have been proposed. Although the definition of the quaternion convolution layer is the same, there are different adaptations of other
External link:
http://arxiv.org/abs/2409.00140
Author:
Pathania, Kshitij
In the realm of image synthesis, achieving fidelity to a reference image while adhering to conditional prompts remains a significant challenge. This paper proposes a novel approach that integrates a diffusion model with latent space manipulation and
External link:
http://arxiv.org/abs/2408.16232
Extreme Multimodal Summarization with Multimodal Output (XMSMO) becomes an attractive summarization approach by integrating various types of information to create extremely concise yet informative summaries for individual modalities. Existing methods
External link:
http://arxiv.org/abs/2408.15829
Authors:
Yang, Dingyi; Jin, Qin
With the development of artificial intelligence, particularly the success of Large Language Models (LLMs), the quantity and quality of automatically generated stories have significantly increased. This has led to the need for automatic story evaluati
External link:
http://arxiv.org/abs/2408.14622
Authors:
Chudasama, Vishal; Sarkar, Hiran; Wasnik, Pankaj; Balasubramanian, Vineeth N; Kalla, Jayateja
Object detection is a critical field in computer vision focusing on accurately identifying and locating specific objects in images or videos. Traditional methods for object detection rely on large labeled training datasets for each object category, w
External link:
http://arxiv.org/abs/2408.14249
Abundant, well-annotated multimodal data in remote sensing are pivotal for aligning complex visual remote sensing (RS) scenes with human language, enabling the development of specialized vision language models across diverse RS interpretation tasks.
External link:
http://arxiv.org/abs/2408.14744