Search results - "Xian, Xingyuan"

Report

Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study

Authors: Fei, Yulin; Gao, Yuhui; Xian, Xingyuan; Zhang, Xiaojin; Wu, Tao; Chen, Wei

With the rise of multimodal large language models, accurately extracting and understanding textual information from video content, referred to as video-based optical character recognition (Video OCR), has become a crucial capability. This paper intro

External link: http://arxiv.org/abs/2412.20613

