Showing 1 - 10
of 1,563
for search: '"Zhou, Junjie"'
Although current Multi-modal Large Language Models (MLLMs) demonstrate promising results in video understanding, processing extremely long videos remains an ongoing challenge. Typically, MLLMs struggle with handling thousands of visual tokens that ex…
External link:
http://arxiv.org/abs/2409.14485
We study dynamic network formation from a centralized perspective. In each period, the social planner builds a single link to connect previously unlinked pairs. The social planner is forward-looking, with instantaneous utility monotonic in the aggreg…
External link:
http://arxiv.org/abs/2409.14136
Author:
Xiao, Shitao, Wang, Yueze, Zhou, Junjie, Yuan, Huaying, Xing, Xingrun, Yan, Ruiran, Wang, Shuting, Huang, Tiejun, Liu, Zheng
In this work, we introduce OmniGen, a new diffusion model for unified image generation. Unlike popular diffusion models (e.g., Stable Diffusion), OmniGen no longer requires additional modules such as ControlNet or IP-Adapter to process diverse contro…
External link:
http://arxiv.org/abs/2409.11340
Multi-modal retrieval is becoming increasingly popular in practice. However, the existing retrievers are mostly text-oriented, which lack the capability to process visual information. Despite the presence of vision-language models like CLIP, the current…
External link:
http://arxiv.org/abs/2406.04292
Author:
Zhou, Junjie, Shu, Yan, Zhao, Bo, Wu, Boya, Xiao, Shitao, Yang, Xi, Xiong, Yongping, Zhang, Bo, Huang, Tiejun, Liu, Zheng
The evaluation of Long Video Understanding (LVU) performance poses an important but challenging research problem. Despite previous efforts, the existing video understanding benchmarks are severely constrained by several issues, especially the insuffi…
External link:
http://arxiv.org/abs/2406.04264
Author:
Liu, Baolin, Yang, Zongyuan, Wang, Pengfei, Zhou, Junjie, Liu, Ziqi, Song, Ziyi, Liu, Yan, Xiong, Yongping
The goal of scene text image super-resolution is to reconstruct high-resolution text-line images from unrecognizable low-resolution inputs. The existing methods relying on the optimization of pixel-level loss tend to yield text edges that exhibit a n…
External link:
http://arxiv.org/abs/2308.06743
Author:
Yang, Zongyuan, Liu, Baolin, Xiong, Yongping, Yi, Lan, Wu, Guibin, Tang, Xiaojun, Liu, Ziqi, Zhou, Junjie, Zhang, Xing
Removing degradation from document images not only improves their visual quality and readability, but also enhances the performance of numerous automated document analysis and recognition tasks. However, existing regression-based methods optimized fo…
External link:
http://arxiv.org/abs/2305.03892
In a model of interconnected conflicts on a network, we compare the equilibrium effort profiles and payoffs under two scenarios: uniform effort (UE), in which each contestant is restricted to exert the same effort across all the battles she participat…
External link:
http://arxiv.org/abs/2302.09861
Transformer models have achieved promising performance in point cloud segmentation. However, most existing attention schemes provide the same feature learning paradigm for all points equally and overlook the enormous difference in size among scene o…
External link:
http://arxiv.org/abs/2301.06869
Author:
Kor, Ryan, Zhou, Junjie
We study a planner's optimal interventions in both the standalone marginal utilities of players on a network and the weights on the links that connect players. The welfare-maximizing joint intervention exhibits the following properties: (a) when the plan…
External link:
http://arxiv.org/abs/2206.03863