Showing 1 - 5 of 5 for the search: '"Ma, Zhiyin"'
While large multi-modal models (LMM) have shown notable progress in multi-modal tasks, their capabilities in tasks involving dense textual content remain to be fully explored. Dense text, which carries important information, is often found in documents …
External link:
http://arxiv.org/abs/2405.06706
We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks. Our approach introduces enhancements across several dimensions: by adopting Shifted Window Attention with zero-initialization, we achieve cross-window connectivity … (see the sketch after this entry's external link).
External link:
http://arxiv.org/abs/2403.04473
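The snippet above names Shifted Window Attention with zero-initialization as the source of cross-window connectivity. Below is a minimal PyTorch sketch of that general idea, assuming a pre-norm residual block whose output projection is zero-initialized so the block acts as an identity at the start of training; the class name, shapes, and window handling are illustrative assumptions, not TextMonkey's released code.

import torch
import torch.nn as nn

class WindowAttnBlock(nn.Module):
    """Window attention with optional shift; zero-initialized output projection."""
    def __init__(self, dim, num_heads, window_size, shift=0):
        super().__init__()
        self.window_size = window_size
        self.shift = shift                      # shift > 0 connects neighbouring windows
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)
        nn.init.zeros_(self.proj.weight)        # zero-init: the block is a no-op at step 0,
        nn.init.zeros_(self.proj.bias)          # so it does not disturb pretrained features

    def forward(self, x):                       # x: (B, H, W, C) feature map
        B, H, W, C = x.shape
        if self.shift:
            x = torch.roll(x, shifts=(-self.shift, -self.shift), dims=(1, 2))
        ws = self.window_size
        # partition into non-overlapping ws x ws windows (H and W must be multiples of ws)
        win = x.view(B, H // ws, ws, W // ws, ws, C)
        win = win.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)
        h = self.norm(win)
        h, _ = self.attn(h, h, h)
        win = win + self.proj(h)                # residual branch starts at zero
        # reverse the window partition
        win = win.view(B, H // ws, W // ws, ws, ws, C)
        x = win.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
        if self.shift:
            x = torch.roll(x, shifts=(self.shift, self.shift), dims=(1, 2))
        return x

# Usage (assumed shapes): WindowAttnBlock(256, 8, window_size=7, shift=3)(torch.rand(1, 28, 28, 256))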
Authors:
Li, Zhang; Yang, Biao; Liu, Qiang; Ma, Zhiyin; Zhang, Shuo; Yang, Jingxu; Sun, Yabo; Liu, Yuliang; Bai, Xiang
Large Multimodal Models (LMMs) have shown promise in vision-language tasks but struggle with high-resolution input and detailed scene understanding. Addressing these challenges, we introduce Monkey to enhance LMM capabilities. Firstly, Monkey processes … (see the tiling sketch after this entry's external link).
External link:
http://arxiv.org/abs/2311.06607
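The Monkey abstract above is cut off before it explains how high-resolution input is handled; a common approach, and an assumption here rather than the paper's actual pipeline, is to split the image into encoder-sized tiles plus a resized global view so each piece fits the vision encoder's native resolution. A minimal sketch, with the 448-pixel tile size and the tile_image name chosen for illustration:

import torch
import torch.nn.functional as F

def tile_image(img: torch.Tensor, tile: int = 448) -> torch.Tensor:
    """img: (C, H, W). Returns (N + 1, C, tile, tile): N local tiles plus one global view."""
    C, H, W = img.shape
    # pad on the bottom/right so H and W become multiples of the tile size
    pad_h = (tile - H % tile) % tile
    pad_w = (tile - W % tile) % tile
    img = F.pad(img, (0, pad_w, 0, pad_h))
    # non-overlapping tiles via unfold: (C, nH, nW, tile, tile) -> (nH * nW, C, tile, tile)
    tiles = img.unfold(1, tile, tile).unfold(2, tile, tile)
    tiles = tiles.permute(1, 2, 0, 3, 4).reshape(-1, C, tile, tile)
    # a downscaled view of the whole image keeps global context alongside the tiles
    global_view = F.interpolate(img.unsqueeze(0), size=(tile, tile),
                                mode="bilinear", align_corners=False)
    return torch.cat([tiles, global_view], dim=0)

# Usage: tile_image(torch.rand(3, 896, 1344)).shape -> torch.Size([7, 3, 448, 448])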
Academic article (logging in is required to view this result).
Authors:
Ma, Fangyuan; Dai, Shujuan (shujuandai@163.com); Tao, Dongping; Tao, Youjun (2576101555@qq.com); Ma, Zhiyin
Published in:
Energy Sources Part A: Recovery, Utilization & Environmental Effects. 2021, Vol. 43 Issue 10, p1151-1161. 11p.