Showing 1 - 3 of 3 for search: '"Angie Boggust"'
Author:
Rameswar Panda, Andrew Rouditchenko, Hilde Kuehne, Angie Boggust, James Glass, Rogerio Feris, David Harwath, Brian Chen, Brian Kingsbury, Samuel Thomas, Michael Picheny
In this paper, we explore self-supervised audio-visual models that learn from instructional videos. Prior work has shown that these models can relate spoken words and sounds to visual content after training on a large-scale dataset of videos, but the …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4e8574a412328c0178369baec3688670
Saliency methods -- techniques to identify the importance of input features on a model's output -- are a common step in understanding neural network behavior. However, interpreting saliency requires tedious manual inspection to identify and aggregate …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a6f00211456b8f8c857216ab0b84a002
Author:
Kartik Audhkhasi, Angie Boggust, Rogerio Feris, Andrew Rouditchenko, Rameswar Panda, Brian Chen, James Glass, Dhiraj Joshi, Michael Picheny, Antonio Torralba, David Harwath, Brian Kingsbury, Hilde Kuehne, Samuel Thomas
Current methods for learning visually grounded language from videos often rely on text annotation, such as human generated captions or machine generated automatic speech recognition (ASR) transcripts. In this work, we introduce the Audio-Video Langua …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c9b214fc7b0b9e46b2c19f75b611c289