Search results

Report

How Control Information Influences Multilingual Text Image Generation and Editing?

Authors: Zhang, Boqiang; Gao, Zuan; Qu, Yadong; Xie, Hongtao

Visual text generation has significantly advanced through diffusion models aimed at producing images with readable and realistic text. Recent works primarily use a ControlNet-based framework, employing standard font text images to control diffusion m

External link: http://arxiv.org/abs/2407.11502

Report

Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition

Authors: Gao, Zuan; Wang, Yuxin; Qu, Yadong; Zhang, Boqiang; Wang, Zixiao; Xu, Jianjun; Xie, Hongtao

In text recognition, self-supervised pre-training emerges as a good solution to reduce dependence on expansive annotated real data. Previous studies primarily focus on local visual representation by leveraging mask image modeling or sequence contrast

External link: http://arxiv.org/abs/2405.05841

Report

Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing

Authors: Zhang, Boqiang; Xie, Hongtao; Gao, Zuan; Wang, Yuxin

Scene text images contain not only style information (font, background) but also content information (character, texture). Different scene text tasks need different information, but previous representation learning methods use tightly coupled feature

External link: http://arxiv.org/abs/2405.04377
