Description: |
Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective and seamlessly transfer knowledge between the third-person (observer) and first-person (actor) perspectives. Despite this, learning models that link the two perspectives for human action recognition has not been well studied. We address this challenge by introducing Charades-Ego, a large-scale dataset of paired first-person and third-person videos, and by presenting a formulation for learning a joint representation of actions from these two perspectives. This talk will present the dataset and our actor-observer model. |