Search results

Report

Deadline and Priority Constrained Immersive Video Streaming Transmission Scheduling

Authors: Feng, Tongtong; Qi, Qi; He, Bo; Wang, Jingyu

Deadline-aware transmission scheduling in immersive video streaming is crucial. The objective is to guarantee that at least a certain block in multi-links is fully delivered within their deadlines, which is referred to as delivery ratio. Compared wit

External link: http://arxiv.org/abs/2408.17028

Report

UMono: Physical Model Informed Hybrid CNN-Transformer Framework for Underwater Monocular Depth Estimation

Authors: Wang, Jian; Wang, Jing; Rong, Shenghui; He, Bo

Underwater monocular depth estimation serves as the foundation for tasks such as 3D reconstruction of underwater scenes. However, due to the influence of light and medium, the underwater environment undergoes a distinctive imaging process, which pres

External link: http://arxiv.org/abs/2407.17838

Report

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Authors: He, Bo; Li, Hengduo; Jang, Young Kyun; Jia, Menglin; Cao, Xuefei; Shah, Ashish; Shrivastava, Abhinav; Lim, Ser-Nam

With the success of large language models (LLMs), integrating the vision model into LLMs to build vision-language foundation models has gained much more interest recently. However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoC

External link: http://arxiv.org/abs/2404.05726

Report

OmniVid: A Generative Framework for Universal Video Understanding

Authors: Wang, Junke; Chen, Dongdong; Luo, Chong; He, Bo; Yuan, Lu; Wu, Zuxuan; Jiang, Yu-Gang

The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution. Despite sharing a common goal, different tasks often rely on distinct

External link: http://arxiv.org/abs/2403.17935

Report

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Authors: Wang, Junke; Meng, Lingchen; Weng, Zejia; He, Bo; Wu, Zuxuan; Jiang, Yu-Gang

Existing visual instruction tuning methods typically prompt large language models with textual descriptions to generate instruction-following data. Despite the promising performance achieved, these descriptions are derived from image annotations, whi

External link: http://arxiv.org/abs/2311.07574

Report

Chop & Learn: Recognizing and Generating Object-State Compositions

Authors: Saini, Nirat; Wang, Hanyu; Swaminathan, Archana; Jayasundara, Vinoj; He, Bo; Gupta, Kamal; Shrivastava, Abhinav

Recognizing and generating object-state compositions has been a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We

External link: http://arxiv.org/abs/2309.14339

Academic article

Effect of laser remelting on surface morphology and mechanical properties of laser deposition manufactured thin-walled Ti-6Al-4V alloy

Authors: He, Bo; Tan, Jian; Yang, Guang; Yi, Junzhen; Wang, Yushi

Published in: Rapid Prototyping Journal, 2024, Vol. 30, Issue 8, pp. 1638-1647.

External link: http://www.emeraldinsight.com/doi/10.1108/RPJ-02-2023-0052

Academic article

Novel step-down topologies of star-connected autotransformer

Authors: Wang, JiaRong; He, Bo; Chen, XiaoQiang

Published in: Circuit World, 2021, Vol. 50, Issue 2/3, pp. 225-239.

External link: http://www.emeraldinsight.com/doi/10.1108/CW-11-2019-0159

Report

Towards Scalable Neural Representation for Diverse Videos

Authors: He, Bo; Yang, Xitong; Wang, Hanyu; Wu, Zuxuan; Chen, Hao; Huang, Shuaiyi; Ren, Yixuan; Lim, Ser-Nam; Shrivastava, Abhinav

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV). While achieving promising results, existing INR-based methods are limit

External link: http://arxiv.org/abs/2303.14124

Report

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

Authors: He, Bo; Wang, Jun; Qiu, Jielin; Bui, Trung; Shrivastava, Abhinav; Wang, Zhaowen

The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries. Unlike the unimodal summarization, the multimodal summarization task explicitly leverages cross-modal information to

External link: http://arxiv.org/abs/2303.07284
