Showing 1 - 10 of 2,932 for the search: '"Albanie, S"'
Published in:
European Conference on Computer Vision (ECCV) 2022
European Conference on Computer Vision (ECCV) 2022, Oct 2022, Tel Aviv, Israel. pp.671-690, ⟨10.1007/978-3-031-19833-5_39⟩
Lecture Notes in Computer Science ISBN: 9783031198328
Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d922b8ca5d034d81326aa4516cf7b228
https://hal.science/hal-03981733
Academic article
This result cannot be displayed to users who are not logged in. Log in to view this result.
Academic article
This result cannot be displayed to users who are not logged in. Log in to view this result.
The focus of this work is sign spotting: given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video. To achieve this sign spott…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ef07ed692b83376614e1cb0879669a91
https://www.repository.cam.ac.uk/handle/1810/337554
Published in:
2021 IEEE/CVF International Conference on Computer Vision (ICCV).
In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders. By contrast, despite the natural symmetry, the desig…
Published in:
International Conference on Computer Vision (ICCV)
International Conference on Computer Vision (ICCV), Oct 2021, Montreal, Canada
The goal of this work is to temporally align asynchronous subtitles in sign language videos. In particular, we focus on sign-language interpreted TV broadcast data comprising (i) a video of continuous signing, and (ii) subtitl…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b5e8212b3ee5db02ee3297fbe81ec08f
http://arxiv.org/abs/2105.02877
Author:
Albanie, S, Varol, G, Momeni, L, Afouras, T, Brown, A, Zhang, C, Coto, E, Camgoz, NC, Saunders, B, Dutta, A, Fox, N, Bowden, R, Woll, B, Zisserman, A
In this work, we propose a framework that enables collection of large-scale, diverse sign language datasets that can be used to train automatic sign language recognition models. The first contribution of this work is SDTRACK, a generic method for sig…
External link:
https://explore.openaire.eu/search/publication?articleId=od______1064::2327eb624f05c24b0891ec2b99101f74
https://ora.ox.ac.uk/objects/uuid:ffd79f6f-5ff4-4e9d-b635-0e29493d2030
The goal of this work is to automatically determine whether and when a word of interest is spoken by a talking face, with or without the audio. We propose a zero-shot method suitable for 'in the wild' videos. Our key contributions are: (1) a nove…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c0a290c09221293f2fe023cb75d19752
The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge. Human-generated queries for video datasets 'in the wild' vary a lot in terms of degree of specificity, with some qu…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0510df38d04449f1ac1dc413c671af5a
https://ora.ox.ac.uk/objects/uuid:502da19a-2a9c-45f4-95f0-ee09ecf77340
While the use of bottom-up local operators in convolutional neural networks (CNNs) matches well some of the statistics of natural images, it may also prevent such models from capturing contextual long-range feature interactions. In this work, we prop…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a3d97fb2770014e2e45393de5d809190