Showing 1 - 10 of 31 for search: '"Radu Soricut"'
Published in:
AAAI
Human ratings are currently the most accurate way to assess the quality of an image captioning model, yet most often the only used outcome of an expensive human rating evaluation is a few overall statistics over the evaluation dataset. In this paper,
Published in:
Lecture Notes in Computer Science ISBN: 9783031198298
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::812f616293abde5f8d90d80803c6f7f7
https://doi.org/10.1007/978-3-031-19830-4_15
Author:
Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong
Published in:
Computer Vision and Image Understanding. 224:103557
Visual 2.5D perception involves understanding the semantics and geometry of a scene through reasoning about object relationships with respect to the viewer in an environment. However, existing works in visual recognition primarily focus on the semant
Author:
Zhenhai Zhu, Radu Soricut
Published in:
ACL/IJCNLP (1)
We describe an efficient hierarchical method to compute attention in the Transformer architecture. The proposed attention mechanism exploits a matrix structure similar to the Hierarchical Matrix (H-Matrix) developed by the numerical analysis communit
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::deb839f81ac52a2c0849f39d2050f8bf
http://arxiv.org/abs/2107.11906
Published in:
CVPR
The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. However, these datasets are often collected with overrestrictive requiremen
Published in:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.
Published in:
Findings of the Association for Computational Linguistics: EMNLP 2021.
Developers of text generation models rely on automated evaluation metrics as a stand-in for slow and expensive manual evaluations. However, image captioning metrics have struggled to give accurate learned estimates of the semantic and pragmatic succe
Recent advances in automatic evaluation metrics for text have shown that deep contextualized word representations, such as those generated by BERT encoders, are helpful for designing metrics that correlate well with human judgements. At the same time
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::379f019489195617a2991d1372612136
http://arxiv.org/abs/2010.06150
Published in:
EMNLP (1)
Sequence generation models trained with teacher-forcing suffer from issues related to exposure bias and lack of differentiability across timesteps. Our proposed method, Teacher-Forcing with N-grams (TeaForN), addresses both these problems directly, t
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::570223774c753827326117f5965bf4ef
http://arxiv.org/abs/2010.03494
Author:
Radu Soricut, Ashish V. Thapliyal
Published in:
ACL
Cross-modal language generation tasks such as image captioning are directly hurt in their ability to support non-English languages by the trend of data-hungry models combined with the lack of non-English annotations. We investigate potential solution
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::30dbb9c07c6a02c446ec3872730d3d67