Search results - "Wang, Zhaokai"

Report

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Authors: Luo, Gen, Yang, Xue, Dou, Wenhan, Wang, Zhaokai, Dai, Jifeng, Qiao, Yu, Zhu, Xizhou

The rapid advancement of Large Language Models (LLMs) has led to an influx of efforts to extend their capabilities to multimodal tasks. Among them, growing attention has been focused on monolithic Multimodal Large Language Models (MLLMs) that integra

External link: http://arxiv.org/abs/2410.08202

Report

Parameter-Inverted Image Pyramid Networks

Authors: Zhu, Xizhou, Yang, Xue, Wang, Zhaokai, Li, Hao, Dou, Wenhan, Ge, Junqi, Lu, Lewei, Qiao, Yu, Dai, Jifeng

Image pyramids are commonly used in modern computer vision tasks to obtain multi-scale features for precise understanding of images. However, image pyramids process multiple resolutions of images using the same large-scale model, which requires signi

External link: http://arxiv.org/abs/2406.04330

Report

ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning

Authors: Tang, Yihong, Wang, Zhaokai, Qu, Ao, Yan, Yihao, Wu, Zhaofeng, Zhuang, Dingyi, Kai, Jushi, Hou, Kebing, Guo, Xiaotong, Zhao, Jinhua, Zhao, Zhan, Ma, Wei

Citywalk, a recently popular form of urban travel, requires genuine personalization and understanding of fine-grained requests compared to traditional itinerary planning. In this paper, we introduce the novel task of Open-domain Urban Itinerary Plann

External link: http://arxiv.org/abs/2402.07204

Report

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft

Authors: Li, Hao, Yang, Xue, Wang, Zhaokai, Zhu, Xizhou, Zhou, Jie, Qiao, Yu, Wang, Xiaogang, Li, Hongsheng, Lu, Lewei, Dai, Jifeng

Many reinforcement learning environments (e.g., Minecraft) provide only sparse rewards that indicate task completion or failure with binary values. The challenge in exploration efficiency in such environments makes it difficult for reinforcement-lear

External link: http://arxiv.org/abs/2312.09238

Report

Video Background Music Generation: Dataset, Method and Evaluation

Authors: Zhuo, Le, Wang, Zhaokai, Wang, Baisen, Liao, Yue, Bao, Chenxi, Peng, Stanley, Han, Songhao, Zhang, Aixi, Fang, Fei, Liu, Si

Music is essential when editing videos, but selecting music manually is difficult and time-consuming. Thus, we seek to automatically generate background music tracks given video input. This is a challenging task since it requires music-video datasets

External link: http://arxiv.org/abs/2211.11248

Academic article

Glutamine maintains the stability of alveolar structure and function after lung transplantation by inhibiting autophagy

Authors: Tan, Jun, Wang, Zhaokai, Huang, Zhihong, Huang, Ai, Zhang, Huan, Huang, Lei, Song, Naicheng, Xin, Gaojie, Jiang, Ke, Sun, Xiangfu

Published in: Biochemical and Biophysical Research Communications, 1 October 2024, 727

Report

Video Background Music Generation with Controllable Music Transformer

Authors: Di, Shangzhe, Jiang, Zeren, Liu, Si, Wang, Zhaokai, Zhu, Leyan, He, Zexin, Liu, Hongming, Yan, Shuicheng

In this work, we address the task of video background music generation. Some previous works achieve effective music generation but are unable to generate melodious music tailored to a particular video, and none of them considers the video-music rhyth

External link: http://arxiv.org/abs/2111.08380

Academic article

Efficacy, safety, and prognostic modeling in neoadjuvant immunotherapy for esophageal squamous cell carcinoma

Authors: Song, Naicheng, Wang, Zhaokai, Sun, Quanchao, Xin, Gaojie, Yao, Zuhuan, Huang, Ai, Xing, Shijie, Qu, Yue, Zhang, Huan, Huang, Zhihong, Liao, Yongde, Jiang, Ke

Published in: International Immunopharmacology, 5 December 2024, 142, Part A

Academic article

Optimal strategies for assigning prior boundary settings in Hydraulic Tomography analysis

Authors: Su, Xiaoru, Yeh, Tian-Chyi Jim, Li, Kuangjia, Wang, Guangcai, Wang, Zhaokai

Published in: Advances in Water Resources, April 2024, 186

Report

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps

Authors: Wang, Zhaokai, Bao, Renda, Wu, Qi, Liu, Si

When describing an image, reading text in the visual scene is crucial to understand the key information. Recent work explores the TextCaps task, i.e. image captioning with reading Optical Character Recognition (OCR) tokens, which requires models to r

External link: http://arxiv.org/abs/2012.03662
