RusTitW: Russian Language Text Dataset for Visual Text in-the-Wild Recognition
Author: | Markov, Igor; Nesteruk, Sergey; Kuznetsov, Andrey; Dimitrov, Denis |
---|---|
Publication year: | 2023 |
Subject: | |
Document type: | Working Paper |
Description: | Information surrounds people in modern life. Text is a very efficient type of information that people have used for communication for centuries. However, automated text-in-the-wild recognition remains a challenging problem. The major limitation for a DL system is the lack of training data. For competitive performance, the training set must contain many samples that replicate real-world cases. While there are many high-quality datasets for English text recognition, there are no available datasets for the Russian language. In this paper, we present a large-scale human-labeled dataset for Russian text recognition in-the-wild. We also publish a synthetic dataset and code to reproduce the generation process. Comment: 5 pages, 6 figures, 2 tables |
Database: | arXiv |
External link: |