Search Results - "Liu, Zhaoyang"

Report

VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

Author/Creator: Wu, Jiannan, Zhong, Muyan, Xing, Sen, Lai, Zeqiang, Liu, Zhaoyang, Wang, Wenhai, Chen, Zhe, Zhu, Xizhou, Lu, Lewei, Lu, Tong, Luo, Ping, Qiao, Yu, Dai, Jifeng

We present VisionLLM v2, an end-to-end generalist multimodal large model (MLLM) that unifies visual perception, understanding, and generation within a single framework. Unlike traditional MLLMs limited to text output, VisionLLM v2 significantly broad

External link: http://arxiv.org/abs/2406.08394

Report

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Author/Creator: Tian, Zeyue, Liu, Zhaoyang, Yuan, Ruibin, Pan, Jiahao, Huang, Xiaoqiang, Liu, Qifeng, Tan, Xu, Chen, Qifeng, Xue, Wei, Guo, Yike

In this work, we systematically study music generation conditioned solely on the video. First, we present a large-scale dataset comprising 190K video-music pairs, including various genres such as movie trailers, advertisements, and documentaries. Fur

External link: http://arxiv.org/abs/2406.04321

Report

LLMs Meet Multimodal Generation and Editing: A Survey

Author/Creator: He, Yingqing, Liu, Zhaoyang, Chen, Jingye, Tian, Zeyue, Liu, Hongyu, Chi, Xiaowei, Liu, Runtao, Yuan, Ruibin, Xing, Yazhou, Wang, Wenhai, Dai, Jifeng, Zhang, Yong, Xue, Wei, Liu, Qifeng, Guo, Yike, Chen, Qifeng

With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on multimodal understanding. This survey el

External link: http://arxiv.org/abs/2405.19334

Report

Paths of A Million People: Extracting Life Trajectories from Wikipedia

Author/Creator: Zhang, Ying, Li, Xiaofeng, Liu, Zhaoyang, Zhang, Haipeng

Notable people's life trajectories have been a focus of study -- the locations and times of various activities, such as birth, death, education, marriage, competition, work, delivering a speech, making a scientific discovery, finishing a masterpiece,

External link: http://arxiv.org/abs/2406.00032

Report

MASTER: Market-Guided Stock Transformer for Stock Price Forecasting

Author/Creator: Li, Tong, Liu, Zhaoyang, Shen, Yanyan, Wang, Xue, Chen, Haokun, Huang, Sen

Stock price forecasting has remained an extremely challenging problem for many decades due to the high volatility of the stock market. Recent efforts have been devoted to modeling complex stock correlations toward joint stock price forecasting. Exist

External link: http://arxiv.org/abs/2312.15235

Report

Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection

Author/Creator: Zhou, Zhen, Ma, Yunkai, Fan, Junfeng, Liu, Zhaoyang, Jing, Fengshui, Tan, Min

Published in: Pattern Recognition 2024

In oriented object detection, current representations of oriented bounding boxes (OBBs) often suffer from the boundary discontinuity problem. Methods of designing continuous regression losses do not essentially solve this problem. Although Gaussian bound

External link: http://arxiv.org/abs/2311.05410

Report

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Author/Creator: Liu, Zhaoyang, Lai, Zeqiang, Gao, Zhangwei, Cui, Erfei, Li, Ziheng, Zhu, Xizhou, Lu, Lewei, Chen, Qifeng, Qiao, Yu, Dai, Jifeng, Wang, Wenhai

We present ControlLLM, a novel framework that enables large language models (LLMs) to utilize multi-modal tools for solving complex real-world tasks. Despite the remarkable performance of LLMs, they still struggle with tool invocation due to ambiguou

External link: http://arxiv.org/abs/2310.17796

Report

Data-Juicer: A One-Stop Data Processing System for Large Language Models

Author/Creator: Chen, Daoyuan, Huang, Yilun, Ma, Zhijian, Chen, Hesen, Pan, Xuchen, Ge, Ce, Gao, Dawei, Xie, Yuexiang, Liu, Zhaoyang, Gao, Jinyang, Li, Yaliang, Ding, Bolin, Zhou, Jingren

The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, heterogeneous, and high-quality data. A data recipe is a mixture of data from different sources for training LLMs, which plays a vital role in LLMs' perf

External link: http://arxiv.org/abs/2309.02033

Report

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

Author/Creator: Liu, Zhaoyang, He, Yinan, Wang, Wenhai, Wang, Weiyun, Wang, Yi, Chen, Shoufa, Zhang, Qinglong, Lai, Zeqiang, Yang, Yang, Li, Qingyun, Yu, Jiashuo, Li, Kunchang, Chen, Zhe, Yang, Xue, Zhu, Xizhou, Wang, Yali, Wang, Limin, Luo, Ping, Dai, Jifeng, Qiao, Yu

We present an interactive visual framework named InternGPT, or iGPT for short. The framework integrates chatbots that have planning and reasoning capabilities, such as ChatGPT, with non-verbal instructions like pointing movements that enable users to

External link: http://arxiv.org/abs/2305.05662

Report

Bayesian Criterion for Re-randomization

Author/Creator: Liu, Zhaoyang, Han, Tingxuan, Rubin, Donald B., Deng, Ke

Re-randomization has gained popularity as a tool for experiment-based causal inference due to its superior covariate balance and statistical efficiency compared to classic randomized experiments. However, the basic re-randomization method, known as R

External link: http://arxiv.org/abs/2303.07904
