Arabic/Latin and Machine-printed/Handwritten Word Discrimination using HOG-based Shape Descriptor
Authors: | Asma Saidani, Afef Kacem, Abdel Belaïd |
---|---|
Contributors: | Technologie de l'Information et de la Communication (UTIC), École Supérieure des Sciences et Technologies de Tunis; Recognition of writing and analysis of documents (READ), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria) |
Year of publication: | 2021 |
Subject: |
Computer engineering. Computer hardware
Computer science; Speech recognition; Histogram of oriented gradients; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; 01 natural sciences; Arabic and Latin separation; TK7885-7895; Set (abstract data type); Script identification; Image analysis and processing; Document analysis; Discriminative model; Classifier (linguistics); 0202 electrical engineering, electronic engineering, information engineering; [INFO]Computer Science [cs]; Pyramid (image processing); 0101 mathematics; Pixel; business.industry; 010102 general mathematics; Pattern recognition; QA75.5-76.95; Script and type identification; Identification (information); Electronic computers. Computer science; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; business; Software; Word (computer architecture) |
Source: | Recercat: Dipósit de la Recerca de Catalunya Varias* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya); Dipòsit Digital de Documents de la UAB, Universitat Autònoma de Barcelona; Electronic Letters on Computer Vision and Image Analysis, Computer Vision Center Press, 2015, 2 (14), pp.24. ⟨10.5565/rev/elcvia.762⟩; ELCVIA: electronic letters on computer vision and image analysis, Vol. 14, Núm. 2 (2015), p. 1-23 |
ISSN: | 1577-5097 |
DOI: | 10.5565/rev/elcvia.762 |
Description: | In this paper, we present an approach for identifying Arabic and Latin script and its type (machine-printed or handwritten) based on Histogram of Oriented Gradients (HOG) descriptors. HOGs are first applied at word level, based on an analysis of writing orientation. They are then extended to partitions of the word image to capture fine, discriminative details. Pyramid HOGs are also used to study their effect at different observation levels of the image. Finally, co-occurrence matrices of HOG are computed to take into account the spatial information between pairs of pixels, which basic HOG ignores. A genetic algorithm is applied to select the potentially informative feature combinations that maximize classification accuracy. The output is a relatively short descriptor that provides an effective input to a Bayes-based classifier. Experimental results on a set of words extracted from standard databases show that our identification system is robust and provides good word script and type identification: 99.07% of words are correctly classified. |
Database: | OpenAIRE |
External link: |
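
The description above mentions HOG computed at word level, on partitions of the word image, and over a pyramid of observation levels. The following is a minimal Python sketch of those three ideas only, not the authors' implementation. The function names `hog_descriptor` and `pyramid_hog`, the 9 unsigned orientation bins, the central-difference gradients, the per-cell L2 normalisation and the dyadic pyramid (1x1, 2x2, 4x4 cells) are all assumptions made for illustration; the record does not specify any of them. The co-occurrence matrices, the genetic feature selection and the Bayes-based classifier are not covered.

```python
import math


def hog_descriptor(image, grid=(1, 1), n_bins=9):
    """Concatenate per-cell histograms of unsigned gradient orientations.

    image: list of equal-length rows of grey levels.
    grid:  (rows, cols) partition of the image; (1, 1) is one word-level cell.
    """
    height, width = len(image), len(image[0])
    grid_rows, grid_cols = grid
    hists = [[0.0] * n_bins for _ in range(grid_rows * grid_cols)]
    bin_width = 180.0 / n_bins

    for y in range(height):
        for x in range(width):
            # Central differences, clamped at the image border (assumed scheme).
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned orientation folded into [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / bin_width), n_bins - 1)
            # Index of the partition cell that contains pixel (x, y).
            cell = (y * grid_rows // height) * grid_cols + (x * grid_cols // width)
            hists[cell][b] += magnitude  # magnitude-weighted vote

    descriptor = []
    for hist in hists:
        norm = math.sqrt(sum(v * v for v in hist))  # L2 norm of the cell
        descriptor.extend(v / norm if norm > 0 else 0.0 for v in hist)
    return descriptor


def pyramid_hog(image, levels=3, n_bins=9):
    """Concatenate HOG descriptors over grids of 1x1, 2x2, 4x4, ... cells."""
    descriptor = []
    for level in range(levels):
        cells = 2 ** level
        descriptor.extend(hog_descriptor(image, grid=(cells, cells), n_bins=n_bins))
    return descriptor
```

As a usage example, `hog_descriptor(word_image)` gives a single word-level histogram, `hog_descriptor(word_image, grid=(1, 2))` gives one histogram per half of the word, and `pyramid_hog(word_image, levels=3)` stacks 1 + 4 + 16 cell histograms. Here `word_image` is a hypothetical grey-level word image supplied by the caller.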