Abstract: |
Research in human/computer interaction has mainly focused on natural language, text, speech and vision, primarily in isolation. Recently, a number of research projects have concentrated on integrating such modalities using intelligent reasoners. The rationale is that many ambiguities inherent in a single mode of communication can be resolved if extra information is available. This paper describes an intelligent multi-modal system called the Smart Work Manager. The Smart Work Manager can process speech, text, face images, gaze information and mouse-simulated gestures as input modalities, and it produces output in the form of speech, text or graphics. The main components of the system are the reasoner, a speech system, a vision system, an integration platform and the application interface. The paper describes the overall architecture of the system, the integration platform and the individual components, which include a non-intrusive, neural-network-based gaze-tracking system. It concludes with a discussion of the applicability of such systems to intelligent human/computer interaction and the lessons learnt in terms of reliability and efficiency. [ABSTRACT FROM AUTHOR] |