You-Do, I-Learn: Egocentric unsupervised discovery of objects and their modes of interaction towards video-based guidance
Author: Walterio W. Mayol-Cuevas, Teesid Leelasawassuk, Dima Damen
Year of publication: 2016
Subject: Thesaurus (information retrieval), Object discovery, Computer science, Assistive computing, Wearable computer, Software engineering, Object (computer science), Gaze, Motion (physics), Task (project management), Mode (computer interface), Video tracking, Object usage, Signal processing, Real-time computer vision, Artificial intelligence & image processing, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Video guidance, Software
Source: Damen, D., Leelasawassuk, T. & Mayol-Cuevas, W. W. 2016, 'You-Do, I-Learn: Egocentric Unsupervised Discovery of Objects and their Modes of Interaction Towards Video-Based Guidance', Computer Vision and Image Understanding, vol. 149, pp. 98-112. https://doi.org/10.1016/j.cviu.2016.02.016
ISSN: 1077-3142
DOI: 10.1016/j.cviu.2016.02.016
Description:

Highlights: Discovering task-relevant objects from egocentric video sequences of multiple users, using appearance, position, motion and attention features. Distinguishing the different ways in which a task-relevant object has been used. Automatically extracting usage snippets to be used for video-based guidance. Tested on a variety of daily tasks such as initialising a printer, preparing a coffee and setting up a gym machine.

Abstract: This paper presents an unsupervised approach towards automatically extracting video-based guidance on object usage from egocentric video and wearable gaze tracking, collected from multiple users while performing tasks. The approach (i) discovers task-relevant objects, (ii) builds a model for each, (iii) distinguishes the different ways in which each discovered object has been used, and (iv) discovers the dependencies between object interactions. The work investigates using appearance, position, motion and attention features, and presents results using each feature and a combination of the relevant ones. Moreover, an online scalable approach is presented and compared to the offline results. The paper proposes a method for selecting a suitable video guide to be displayed to a novice user, indicating how to use an object, triggered purely by the user's gaze. The potential assistive mode can also recommend the object to be used next, based on the learnt sequence of object interactions. The approach was tested on a variety of daily tasks such as initialising a printer, preparing a coffee and setting up a gym machine.
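To give a concrete feel for the pipeline the abstract outlines, the sketch below clusters per-frame feature vectors sampled around the wearer's gaze to "discover" objects, then learns a first-order transition table over the discovered objects to recommend what to use next. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy features, the choice of DBSCAN as the clusterer, and every parameter value are assumptions made for the example.

```python
# Minimal sketch (assumptions throughout): discover "objects" by clustering
# multi-cue feature vectors gathered around the wearer's gaze, then model
# the order of object interactions with a first-order transition table.
import numpy as np
from sklearn.cluster import DBSCAN


def discover_objects(features, eps=0.5, min_samples=10):
    """Cluster gaze-attended feature vectors (n_frames x d); each cluster
    is treated as one discovered task-relevant object. DBSCAN's noise
    label (-1) covers regions the wearer merely glanced at."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(features)


def learn_transitions(object_sequence, n_objects):
    """Count transitions between consecutive, distinct object interactions
    and row-normalise them into next-object probabilities."""
    counts = np.zeros((n_objects, n_objects))
    for prev, nxt in zip(object_sequence, object_sequence[1:]):
        if prev != nxt:                       # only count object changes
            counts[prev, nxt] += 1.0
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)


def recommend_next(transitions, current_object):
    """Suggest the object most frequently used after the current one."""
    return int(np.argmax(transitions[current_object]))


# Toy demo: three well-separated 2-D blobs stand in for the real
# appearance/position/motion/attention descriptors.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(loc=c, scale=0.1, size=(50, 2))
                   for c in (0.0, 3.0, 6.0)])
labels = discover_objects(feats, eps=0.4, min_samples=5)

# A hypothetical session: object 0, then 1, then 2, then back to 0.
sequence = [0, 0, 1, 1, 1, 2, 2, 0]
T = learn_transitions(sequence, n_objects=3)
print("after object 1, try object", recommend_next(T, 1))   # -> 2
```

In the paper itself the multi-cue feature combination and the online scalable variant carry the results; the toy above only mirrors the overall shape of the approach.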
Database: OpenAIRE
External link: