Combining Speech and Handwriting Modalities for Mathematical Expression Recognition

Authors: Harold Mouchère, Christian Viard-Gaudin, Simon Petitrenaud, Sofiane Medjkoune
Contributors: Laboratoire des Sciences du Numérique de Nantes (LS2N), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Laboratoire d'Informatique de l'Université du Mans (LIUM), Le Mans Université (UM), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS)
Language: English
Publication year: 2017
Subject:
Computer Networks and Communications
Intelligent character recognition
Computer science
Speech recognition
Human Factors and Ergonomics
Context (language use)
02 engineering and technology
computer.software_genre
Symbol (chemistry)
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
Artificial Intelligence
Handwriting
0202 electrical engineering
electronic engineering
information engineering
Use case
ComputingMilieux_MISCELLANEOUS
ACM: I.: Computing Methodologies/I.5: PATTERN RECOGNITION
Modalities
Modality (human–computer interaction)
business.industry
020206 networking & telecommunications
Computer Science Applications
Human-Computer Interaction
Control and Systems Engineering
Handwriting recognition
Signal Processing
020201 artificial intelligence & image processing
Artificial intelligence
business
computer
Natural language processing
Source: IEEE Transactions on Human-Machine Systems
IEEE Transactions on Human-Machine Systems, IEEE, 2017, 47 (2), pp.259-272. ⟨10.1109/THMS.2017.2647850⟩
ISSN: 2168-2291
DOI: 10.1109/THMS.2017.2647850
Description: In this paper, we open new perspectives for mathematical expression recognition by introducing an original bimodal system. Since handwritten mathematical expression recognition is a very challenging task prone to many ambiguities, we use speech as an additional modality to circumvent limitations that are inherent to the written form. A typical use case is a classroom lecture in which the teacher writes mathematical expressions and reads them aloud to allow a better interpretation. In addition to state-of-the-art solutions for recognizing handwriting and speech, we introduce a multilayer architecture for merging the modalities. Specifically, the Dempster–Shafer theory is used to process the information at the symbol level. This bimodal system is evaluated on real bimodal data, the HAMEX dataset. Combining speech and handwriting yields large improvements over the handwriting modality alone.
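Note: the description states that Dempster–Shafer theory is used to process information at the symbol level, but this record does not give the details. As a rough illustration only, the following Python sketch applies the standard Dempster rule of combination to two mass functions over candidate symbols. The function name `combine_dempster`, the symbol labels, and all mass values are hypothetical and are not taken from the paper, whose actual fusion scheme may differ.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to incompatible hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalize so that the remaining masses sum to 1.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Hypothetical example: the handwriting recognizer cannot tell the letter "x"
# from the multiplication sign, while the speech recognizer favors "x".
frame = frozenset({"x", "times"})
handwriting = {frozenset({"x"}): 0.45, frozenset({"times"}): 0.45, frame: 0.10}
speech = {frozenset({"x"}): 0.70, frame: 0.30}

fused = combine_dempster(handwriting, speech)
best = max(fused, key=fused.get)
```

In this made-up case the fused mass concentrates on the "x" hypothesis, which is the kind of disambiguation the description attributes to adding speech. Mass placed on the whole frame expresses a recognizer's ignorance rather than support for any single symbol.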
Database: OpenAIRE