Multimodal Driver Interaction with Gesture, Gaze and Speech
Author: Abdul Rafey Aftab
Year of publication: 2019
Subject:
Artificial neural network; Computer science; Semantic interpretation; 020206 networking & telecommunications; 02 engineering and technology; Sensor fusion; Semantics; Gaze; Human–computer interaction; Gesture recognition; 0202 electrical engineering, electronic engineering, information engineering; Eye tracking; 020201 artificial intelligence & image processing; Gesture
Source: ICMI
DOI: 10.1145/3340555.3356093
Description: The ever-growing research in computer vision has created new avenues for user interaction. Speech commands and gesture recognition are already being applied alongside various touch-based inputs. It is therefore foreseeable that multimodal input methods are the next phase in the development of user interaction. In this paper, I propose a research plan for novel methods that use multimodal inputs for the semantic interpretation of human-computer interaction, applied specifically to a car driver. A fusion methodology must be designed that makes adequate use of a recognized gesture (specifically finger pointing), eye gaze and head pose to identify reference objects, while using the semantics of speech to create a natural interactive environment for the driver. The proposed plan includes techniques based on artificial neural networks for the fusion of the camera-based modalities (gaze, head pose and gesture), and then combines features extracted from speech with the output of the fusion algorithm to determine the driver's intent (a minimal illustrative sketch follows this record).
Database: OpenAIRE
External link:
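The description names a two-stage architecture, neural-network fusion of the camera-based modalities followed by combination with speech features, but gives no implementation details. The sketch below is one hypothetical way to render that idea in PyTorch; the class name, feature dimensions, layer sizes and the number of intent classes are all illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class DriverIntentFusion(nn.Module):
    """Hypothetical two-stage fusion: camera modalities first, speech second."""

    def __init__(self, gaze_dim=3, head_dim=3, point_dim=3,
                 speech_dim=32, hidden_dim=64, num_intents=8):
        super().__init__()
        # Stage 1: fuse the camera-based modalities (eye gaze, head pose,
        # pointing gesture) into a single reference representation.
        self.camera_fusion = nn.Sequential(
            nn.Linear(gaze_dim + head_dim + point_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Stage 2: combine the fused camera features with speech
        # features and classify the driver's intent.
        self.intent_head = nn.Sequential(
            nn.Linear(hidden_dim + speech_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_intents),
        )

    def forward(self, gaze, head, point, speech):
        fused = self.camera_fusion(torch.cat([gaze, head, point], dim=-1))
        return self.intent_head(torch.cat([fused, speech], dim=-1))


# Usage with a dummy batch of 4 samples: direction vectors for gaze,
# head pose and pointing, plus a fixed-size speech feature vector.
model = DriverIntentFusion()
logits = model(torch.randn(4, 3), torch.randn(4, 3),
               torch.randn(4, 3), torch.randn(4, 32))
print(logits.shape)  # torch.Size([4, 8])
```

The staged design mirrors the ordering in the description: the camera-based modalities are fused before speech semantics are brought in. A real system would additionally need temporal modeling of the modality streams and grounding against candidate reference objects, both of which this sketch omits.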