CLIP-Llama: A New Approach for Scene Text Recognition with a Pre-Trained Vision-Language Model and a Pre-Trained Language Model.

Authors: Zhao, Xiaoqing; Xu, Miaomiao; Silamu, Wushour; Li, Yanbing (liyb@xju.edu.cn)
Source: Sensors (ISSN 1424-8220). Nov 2024, Vol. 24, Issue 22, p. 7371. 14 pp.
Database: Academic Search Ultimate