Automatic generation of a custom corpora for invoice analysis and recognition
Authors: | Abdel Belaïd, Yolande Belaïd, Jerome Blanchard |
---|---|
Contributors: | Analyse et Traitement Informatique de la Langue Française (ATILF), Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Recognition of writing and analysis of documents (READ), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS) |
Language: | English |
Year of publication: | 2019 |
Subject: | Generator (computer programming); Invoice; Computer science; 02 engineering and technology; 010501 environmental sciences; computer.software_genre; 01 natural sciences; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; GEDI format; 0202 electrical engineering, electronic engineering, information engineering; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Graph (abstract data type); 020201 artificial intelligence & image processing; Confidentiality; Invoice generator; [INFO]Computer Science [cs]; Data mining; Graph Convolutional Neural Network; computer; 0105 earth and related environmental sciences |
Source: | ICDAR-WIADAR, Sep 2019, Sydney, Australia |
Description: | International audience; In this paper, we present a generator of bill-type documents that can supply, on demand, the volume of documents a learning system needs. Administrative documents have long been scarce because they are confidential, and this scarcity has been a handicap. The generator also solves the annotation problem: annotations are produced automatically during generation and written directly in XML-GEDI format. To show the value of the generator, we then propose an invoice recognition system based on a graph convolutional neural network. The generator made the experiments easy to control, since we could freely vary the classes, the samples in each class, and their parameters. |
Database: | OpenAIRE |