Search results

Report

StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction

Authors: Zhuang, Jiaheng; Wang, Guoan; Zhang, Siyu; Wang, Xiyang; Zhou, Hangning; Xu, Ziyao; Zhang, Chi; Li, Zhiheng

3D multi-object tracking and trajectory prediction are two crucial modules in autonomous driving systems. Generally, the two tasks are handled separately in traditional paradigms and a few methods have started to explore modeling these two tasks in a

External link: http://arxiv.org/abs/2406.19844

Report

NeB-SLAM: Neural Blocks-based Salable RGB-D SLAM for Unknown Scenes

Authors: Bai, Lizhi; Tian, Chunqi; Yang, Jun; Zhang, Siyu; Liang, Weijian

Neural implicit representations have recently demonstrated considerable potential in the field of visual simultaneous localization and mapping (SLAM). This is due to their inherent advantages, including low storage overhead and representation continu

External link: http://arxiv.org/abs/2405.15151

Report

SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving

Authors: Zhang, Diankun; Wang, Guoan; Zhu, Runwen; Zhao, Jianbo; Chen, Xiwu; Zhang, Siyu; Gong, Jiahao; Zhou, Qibin; Zhang, Wenyuan; Wang, Ningzi; Tan, Feiyang; Zhou, Hangning; Xu, Ziyao; Yao, Haotian; Zhang, Chi; Liu, Xiaojun; Di, Xiaoguang; Li, Bin

End-to-End paradigms use a unified framework to implement multi-tasks in an autonomous driving system. Despite simplicity and clarity, the performance of end-to-end autonomous driving methods on sub-tasks is still far behind the single-task methods.

External link: http://arxiv.org/abs/2404.06892

Report

Fast and Interpretable 2D Homography Decomposition: Similarity-Kernel-Similarity and Affine-Core-Affine Transformations

Authors: Cai, Shen; Wu, Zhanhao; Guo, Lingxi; Wang, Jiachun; Zhang, Siyu; Yan, Junchi; Shen, Shuhan

In this paper, we present two fast and interpretable decomposition methods for 2D homography, which are named Similarity-Kernel-Similarity (SKS) and Affine-Core-Affine (ACA) transformations respectively. Under the minimal $4$-point configuration, the

External link: http://arxiv.org/abs/2402.18008

Report

A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid Network

Authors: Zhang, Siyu; Mao, Wendong; Shi, Huihong; Wang, Zhongfeng

Video compression is widely used in digital television, surveillance systems, and virtual reality. Real-time video decoding is crucial in practical scenarios. Recently, neural video compression (NVC) combines traditional coding with deep learning, ac

External link: http://arxiv.org/abs/2312.10716

Report

KNVQA: A Benchmark for evaluation knowledge-based VQA

Authors: Cheng, Sirui; Zhang, Siyu; Wu, Jiayi; Lan, Muchen

Within the multimodal field, large vision-language models (LVLMs) have made significant progress due to their strong perception and reasoning capabilities in the visual and language systems. However, LVLMs are still plagued by the two critical issues

External link: http://arxiv.org/abs/2311.12639

Report

Multiscale Superpixel Structured Difference Graph Convolutional Network for VL Representation

Authors: Zhang, Siyu; Chen, Yeming; Cheng, Sirui; Sun, Yaoru; Yang, Jun; Bai, Lizhi

Within the multimodal field, the key to integrating vision and language lies in establishing a good alignment strategy. Recently, benefiting from the success of self-supervised learning, significant progress has been made in multimodal semantic repre

External link: http://arxiv.org/abs/2310.13447

Report

Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout

Authors: Wang, Haoran; Tang, Zeshen; Yang, Leya; Sun, Yaoru; Wang, Fang; Zhang, Siyu; Chen, Yeming

Goal-conditioned hierarchical reinforcement learning (HRL) presents a promising approach for enabling effective exploration in complex, long-horizon reinforcement learning (RL) tasks through temporal abstraction. Empirically, heightened inter-level c

External link: http://arxiv.org/abs/2309.13508

Report

Artificial-Spiking Hierarchical Networks for Vision-Language Representation Learning

Authors: Chen, Yeming; Zhang, Siyu; Sun, Yaoru; Liang, Weijian; Wang, Haoran

With the success of self-supervised learning, multimodal foundation models have rapidly adapted a wide range of downstream tasks driven by vision and language (VL) pretraining. State-of-the-art methods achieve impressive performance by pre-training o

External link: http://arxiv.org/abs/2308.09455

Report

LittleMu: Deploying an Online Virtual Teaching Assistant via Heterogeneous Sources Integration and Chain of Teach Prompts

Authors: Tu, Shangqing; Zhang, Zheyuan; Yu, Jifan; Li, Chunyang; Zhang, Siyu; Yao, Zijun; Hou, Lei; Li, Juanzi

Teaching assistants have played essential roles in the long history of education. However, few MOOC platforms are providing human or virtual teaching assistants to support learning for massive online students due to the complexity of real-world onlin

External link: http://arxiv.org/abs/2308.05935
