Arabic/Latin and Machine-printed/Handwritten Word Discrimination using HOG-based Shape Descriptor

Authors: Asma Saidani, Afef Kacem, Abdel Belaïd
Contributors: Technologie de l'Information et de la Communication (UTIC), École Supérieure des Sciences et Technologies de Tunis, Recognition of writing and analysis of documents (READ), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)
Publication year: 2021
Subject:
Computer engineering. Computer hardware
Computer science
Speech recognition
Histogram of oriented gradients
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
02 engineering and technology
01 natural sciences
Arabic and latin separation
TK7885-7895
Set (abstract data type)
script identification
image analysis and processing document analysis
Discriminative model
Classifier (linguistics)
0202 electrical engineering
electronic engineering
information engineering

[INFO]Computer Science [cs]
Pyramid (image processing)
0101 mathematics
Pixel
business.industry
010102 general mathematics
Pattern recognition
QA75.5-76.95
Script and type identification
Identification (information)
Electronic computers. Computer science
020201 artificial intelligence & image processing
Computer Vision and Pattern Recognition
Artificial intelligence
business
Software
Word (computer architecture)
Source: Recercat: Dipósit de la Recerca de Catalunya
Varias* (Consorci de Biblioteques Universitáries de Catalunya, Centre de Serveis Científics i Acadèmics de Catalunya)
Electronic Letters on Computer Vision and Image Analysis
Electronic Letters on Computer Vision and Image Analysis, Computer Vision Center Press, 2015, 2 (14), pp.24. ⟨10.5565/rev/elcvia.762⟩
Dipòsit Digital de Documents de la UAB
Universitat Autònoma de Barcelona
ELCVIA: electronic letters on computer vision and image analysis; Vol. 14, Núm. 2 (2015); p. 1-23
Electronic Letters on Computer Vision and Image Analysis, 2015, 2 (14), pp.24. ⟨10.5565/rev/elcvia.762⟩
ELCVIA Electronic Letters on Computer Vision and Image Analysis, Vol 14, Iss 2 (2015)
ISSN: 1577-5097
DOI: 10.5565/rev/elcvia.762
Description: International audience; In this paper, we present an approach for identifying Arabic and Latin script and its type (machine-printed or handwritten) based on Histogram of Oriented Gradients (HOG) descriptors. HOGs are first applied at word level based on writing orientation analysis. Then, they are extended to word image partitions to capture fine and discriminative details. Pyramid HOGs are also used to study their effects at different observation levels of the image. Finally, co-occurrence matrices of HOG are computed to capture spatial information between pairs of pixels, which is not taken into account in basic HOG. A genetic algorithm is applied to select the potentially informative feature combinations that maximize classification accuracy. The output is a relatively short descriptor that provides an effective input to a Bayes-based classifier. Experimental results on a set of words extracted from standard databases show that our identification system is robust and provides good word script and type identification: 99.07% of words are correctly classified.
Database: OpenAIRE
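
The abstract above describes HOG descriptors computed first on the whole word image and then on partitions of it. As a rough illustration only, the sketch below builds a magnitude-weighted orientation histogram in pure Python and concatenates the histograms of vertical strips. The function names (`hog_descriptor`, `partitioned_hog`), the 9 unsigned orientation bins, the central-difference gradients, and the vertical-strip partitioning are all assumptions made for this example; the record does not state the authors' actual parameters or partition scheme.

```python
import math

def hog_descriptor(image, n_bins=9):
    """Global orientation histogram for a grayscale image (list of rows).

    Hypothetical helper: one histogram over the whole image, with
    unsigned orientations in [0, 180) weighted by gradient magnitude.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against an angle rounding up to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    # L2-normalise so the descriptor does not depend on overall contrast.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

def partitioned_hog(image, n_parts=2, n_bins=9):
    """Concatenate the HOGs of n_parts vertical strips (assumed partitioning)."""
    width = len(image[0])
    step = width // n_parts
    descriptor = []
    for p in range(n_parts):
        start = p * step
        stop = width if p == n_parts - 1 else start + step
        strip = [row[start:stop] for row in image]
        descriptor.extend(hog_descriptor(strip, n_bins))
    return descriptor

# Toy images: a vertical edge and a horizontal edge.
vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
horizontal_edge = [[0] * 6 for _ in range(3)] + [[255] * 6 for _ in range(3)]

h_vertical = hog_descriptor(vertical_edge)
h_horizontal = hog_descriptor(horizontal_edge)
```

In this sketch a vertical edge puts all of its weight in the first bin (gradient direction near 0 degrees) and a horizontal edge in the middle bin (near 90 degrees). A partitioned descriptor is simply `n_parts * n_bins` values long, which is the kind of feature vector a selection step and a classifier could then consume.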