Multimodal Driver Interaction with Gesture, Gaze and Speech

Author: Abdul Rafey Aftab
Year of publication: 2019
Source: ICMI
DOI: 10.1145/3340555.3356093
Description: The ever-growing research in computer vision has created new avenues for user interaction. Speech commands and gesture recognition are already being applied alongside various touch-based inputs. It is therefore foreseeable that multimodal input methods are the next phase in the development of user interaction. In this paper, I propose a research plan for novel methods that use multimodal inputs for the semantic interpretation of human-computer interaction, applied specifically to a car driver. A fusion methodology must be designed that adequately combines a recognized gesture (specifically finger pointing), eye gaze and head pose to identify referenced objects, while using the semantics of speech to create a natural interactive environment for the driver. The proposed plan includes different techniques based on artificial neural networks for fusing the camera-based modalities (gaze, head pose and gesture), and then combines features extracted from speech with this fusion algorithm to determine the driver's intent (see the fusion sketch below).
Database: OpenAIRE
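
Since the record only summarizes the proposed approach, the following is a minimal illustrative sketch of what such a two-stage fusion could look like: small per-modality encoders for gaze, head pose and pointing gesture, whose concatenated embeddings are then combined with speech features to score candidate reference objects. This is a hypothetical PyTorch sketch under assumed feature dimensions and a late-fusion design, not the author's actual implementation; the class name `ModalityFusionNet` and all parameters are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ModalityFusionNet(nn.Module):
    """Hypothetical late-fusion network: encodes gaze, head pose and
    pointing-gesture features separately, concatenates the embeddings,
    and fuses them with speech features to score candidate objects."""

    def __init__(self, gaze_dim=3, head_dim=6, gesture_dim=3,
                 speech_dim=16, hidden=32, num_objects=10):
        super().__init__()
        # One small encoder per camera-based modality (dimensions assumed:
        # gaze direction vector, head pose angles + position, pointing ray).
        self.gaze_enc = nn.Sequential(nn.Linear(gaze_dim, hidden), nn.ReLU())
        self.head_enc = nn.Sequential(nn.Linear(head_dim, hidden), nn.ReLU())
        self.gesture_enc = nn.Sequential(nn.Linear(gesture_dim, hidden), nn.ReLU())
        # Speech features (e.g. an utterance embedding) are combined in a
        # second stage, mirroring the two-step fusion described above.
        self.fusion = nn.Sequential(
            nn.Linear(3 * hidden + speech_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_objects),
        )

    def forward(self, gaze, head, gesture, speech):
        # Stage 1: fuse the camera-based modalities.
        camera = torch.cat(
            [self.gaze_enc(gaze), self.head_enc(head), self.gesture_enc(gesture)],
            dim=-1,
        )
        # Stage 2: combine with speech features to score each candidate
        # object the driver may be referring to.
        return self.fusion(torch.cat([camera, speech], dim=-1))


# Usage with dummy inputs: a batch of one driver observation.
net = ModalityFusionNet()
scores = net(torch.randn(1, 3), torch.randn(1, 6),
             torch.randn(1, 3), torch.randn(1, 16))
print(scores.argmax(dim=-1))  # index of the most likely referenced object
```

The two-stage structure here is one plausible reading of the abstract: camera-based modalities are fused first, and speech is added afterwards to resolve intent; other designs (e.g. fully joint fusion of all four modalities) would fit the description equally well.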