Showing 1 - 3 of 3 for search: '"Barnwal, Rohit"'
Author:
Radfar, Martin, Barnwal, Rohit, Swaminathan, Rupak Vignesh, Chang, Feng-Ju, Strimel, Grant P., Susanj, Nathan, Mouchtaris, Athanasios
The recurrent neural network transducer (RNN-T) is a prominent streaming end-to-end (E2E) ASR technology. In RNN-T, the acoustic encoder commonly consists of stacks of LSTMs. Very recently, as an alternative to LSTM layers, the Conformer architecture
External link:
http://arxiv.org/abs/2209.14868
We tackle the challenge of Visual Question Answering in multi-image setting for the ISVQA dataset. Traditional VQA tasks have focused on a single-image setting where the target answer is generated from a single image. Image set VQA, however, comprise
External link:
http://arxiv.org/abs/2104.00107
Published in:
2013 IEEE International Conference on Multimedia & Expo Workshops (ICMEW); 2013, p1-4, 4p