Showing 1 - 10 of 51 for search: '"Wick, Christoph"'
Author:
Wick, Christoph
In recent years, great progress has been made in the area of Artificial Intelligence (AI) due to the possibilities of Deep Learning which steadily yielded new state-of-the-art results especially in many image recognition tasks. Currently, in some are
In contrast to Connectionist Temporal Classification (CTC) approaches, Sequence-To-Sequence (S2S) models for Handwritten Text Recognition (HTR) suffer from errors such as skipped or repeated words which often occur at the end of a sequence. In this p
External link:
http://arxiv.org/abs/2110.05909
Author:
Reul, Christian, Wick, Christoph, Nöth, Maximilian, Büttner, Andreas, Wehner, Maximilian, Springmann, Uwe
In order to apply Optical Character Recognition (OCR) to historical printings of Latin script fully automatically, we report on our efforts to construct a widely-applicable polyfont recognition model yielding text with a Character Error Rate (CER) ar
External link:
http://arxiv.org/abs/2106.07881
Published in:
MDPI Information 2021, vol. 12, no. 11, article no. 443
Currently, the most widespread neural network architecture for training language models is the so-called BERT, which led to improvements in various Natural Language Processing (NLP) tasks. In general, the larger the number of parameters in a BERT mode
External link:
http://arxiv.org/abs/2104.11559
Author:
Reul, Christian, Christ, Dennis, Hartelt, Alexander, Balbach, Nico, Wehner, Maximilian, Springmann, Uwe, Wick, Christoph, Grundig, Christine, Büttner, Andreas, Puppe, Frank
Optical Character Recognition (OCR) on historical printings is a challenging task mainly due to the complexity of the layout and the highly variant typography. Nevertheless, in the last few years great progress has been made in the area of historical
External link:
http://arxiv.org/abs/1909.04032
In this paper we evaluate Optical Character Recognition (OCR) of 19th century Fraktur scripts without book-specific training using mixed models, i.e. models trained to recognize a variety of fonts and typesets from previously unseen sources. We descr
External link:
http://arxiv.org/abs/1810.03436
Published in:
Digital Humanities Quarterly 14 (2), 2020
Optical Character Recognition (OCR) on contemporary and historical data is still in the focus of many researchers. Especially historical prints require book specific trained OCR models to achieve applicable results (Springmann and Lüdeling, 2016, R
External link:
http://arxiv.org/abs/1807.02004
We combine three methods which significantly improve the OCR accuracy of OCR models trained on early printed books: (1) The pretraining method utilizes the information stored in already existing models trained on a variety of typesets (mixed models)
External link:
http://arxiv.org/abs/1802.10038
This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books. While the standard model of line-based OCR uses a single LSTM layer, we utilize a CNN- and Pooling-Layer combination in adv
External link:
http://arxiv.org/abs/1802.10033
A method is presented that significantly reduces the character error rates for OCR text obtained from OCRopus models trained on early printed books when only small amounts of diplomatic transcriptions are available. This is achieved by building from
External link:
http://arxiv.org/abs/1712.05586