Search results

Report

Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints

Authors: Toida, Keisuke; Kato, Naoki; Segawa, Osamu; Nakamura, Takeshi; Hotta, Kazuhiro

We propose a Ground IoU (Gr-IoU) to address the data association problem in multi-object tracking. When tracking objects detected by a camera, it often occurs that the same object is assigned different IDs in consecutive frames, especially when objec

External link: http://arxiv.org/abs/2409.03252

Report

UNSURE: Unknown Noise level Stein's Unbiased Risk Estimator

Authors: Tachella, Julián; Davies, Mike; Jacques, Laurent

Recently, many self-supervised learning methods for image reconstruction have been proposed that can learn from noisy data alone, bypassing the need for ground-truth references. Most existing methods cluster around two classes: i) Noise2Self and simi

External link: http://arxiv.org/abs/2409.01985

Report

Deep Probabilistic Traversability with Test-time Adaptation for Uncertainty-aware Planetary Rover Navigation

Authors: Endo, Masafumi; Taniai, Tatsunori; Ishigami, Genya

Traversability assessment of deformable terrain is vital for safe rover navigation on planetary surfaces. Machine learning (ML) is a powerful tool for traversability prediction but faces predictive uncertainty. This uncertainty leads to prediction er

External link: http://arxiv.org/abs/2409.00641

Report

Data Augmentation for Image Classification using Generative AI

Authors: Rahat, Fazle; Hossain, M Shifat; Ahmed, Md Rubel; Jha, Sumit Kumar; Ewetz, Rickard

Scaling laws dictate that the performance of AI models is proportional to the amount of available data. Data augmentation is a promising solution to expanding the dataset size. Traditional approaches focused on augmentation using rotation, translatio

External link: http://arxiv.org/abs/2409.00547

Report

Statistical Analysis of the Impact of Quaternion Components in Convolutional Neural Networks

Authors: Altamirano-Gómez, Gerardo; Gershenson, Carlos

In recent years, several models using Quaternion-Valued Convolutional Neural Networks (QCNNs) for different problems have been proposed. Although the definition of the quaternion convolution layer is the same, there are different adaptations of other

External link: http://arxiv.org/abs/2409.00140

Report

Enhancing Conditional Image Generation with Explainable Latent Space Manipulation

Author: Pathania, Kshitij

In the realm of image synthesis, achieving fidelity to a reference image while adhering to conditional prompts remains a significant challenge. This paper proposes a novel approach that integrates a diffusion model with latent space manipulation and

External link: http://arxiv.org/abs/2408.16232

Report

SITransformer: Shared Information-Guided Transformer for Extreme Multimodal Summarization

Authors: Liu, Sicheng; Wang, Lintao; Zhu, Xiaogan; Lu, Xuequan; Wang, Zhiyong; Hu, Kun

Extreme Multimodal Summarization with Multimodal Output (XMSMO) becomes an attractive summarization approach by integrating various types of information to create extremely concise yet informative summaries for individual modalities. Existing methods

External link: http://arxiv.org/abs/2408.15829

Report

What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation

Authors: Yang, Dingyi; Jin, Qin

With the development of artificial intelligence, particularly the success of Large Language Models (LLMs), the quantity and quality of automatically generated stories have significantly increased. This has led to the need for automatic story evaluati

External link: http://arxiv.org/abs/2408.14622

Report

Beyond Few-shot Object Detection: A Detailed Survey

Authors: Chudasama, Vishal; Sarkar, Hiran; Wasnik, Pankaj; Balasubramanian, Vineeth N; Kalla, Jayateja

Object detection is a critical field in computer vision focusing on accurately identifying and locating specific objects in images or videos. Traditional methods for object detection rely on large labeled training datasets for each object category, w

External link: http://arxiv.org/abs/2408.14249

Report

RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models

Authors: Ge, Junyao; Zheng, Yang; Guo, Kaitai; Liang, Jimin

Abundant, well-annotated multimodal data in remote sensing are pivotal for aligning complex visual remote sensing (RS) scenes with human language, enabling the development of specialized vision language models across diverse RS interpretation tasks.

External link: http://arxiv.org/abs/2408.14744
