Showing 1 - 10 of 229 for search: '"R. Manmatha"'
Author:
Ron Slossberg, Oron Anschel, Amir Markovitz, Ron Litman, Aviad Aberdam, Shahar Tsiper, Shai Mazor, Jon Wu, R. Manmatha
Published in:
Lecture Notes in Computer Science ISBN: 9783031250682
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::061617dcca38ee47f02eaf3dc52b4df7
https://doi.org/10.1007/978-3-031-25069-9_18
Published in:
Lecture Notes in Computer Science ISBN: 9783031250842
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::8c447c9cf8919b29b31548b443658940
https://doi.org/10.1007/978-3-031-25085-9_1
Published in:
Lecture Notes in Computer Science ISBN: 9783031198144
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::df451cc1ac8ccdad47c47447f4707f96
https://doi.org/10.1007/978-3-031-19815-1_15
Author:
Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R Manmatha, Mu Li, Alexander J Smola
Published in:
IEEE Transactions on Pattern Analysis and Machine Intelligence.
Starting from the seminal work of Fully Convolutional Networks (FCN), there has been significant progress on semantic segmentation. However, deep learning models often require large amounts of pixelwise annotations to train accurate and robust models
We present DocFormer -- a multi-modal transformer-based architecture for the task of Visual Document Understanding (VDU). VDU is a challenging problem which aims to understand documents in their varied formats (forms, receipts, etc.) and layouts. In a
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::355c4a6a8697c19b2b01f6b77af88c1f
http://arxiv.org/abs/2106.11539
Published in:
WACV
This paper proposes a new end-to-end trainable model for lossy image compression, which includes several novel components. The method incorporates 1) an adequate perceptual similarity metric; 2) saliency in the images; 3) a hierarchical auto-regressi
Author:
Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola
It is well known that featuremap attention and multi-path representation are important for visual recognition. In this paper, we present a modularized architecture, which applies the channel-wise attention on different network branches to leverage th
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7e36b4f1135cce7f906520035eb619ea
http://arxiv.org/abs/2004.08955
Published in:
CVPR
Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this paper, we intro
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::413d34bbfe68c6981641263036796e2d
Author:
Oron Anschel, Shai Mazor, Ron Litman, Shahar Tsiper, Aviad Aberdam, R. Manmatha, Pietro Perona, Ron Slossberg
Published in:
CVPR
We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. To account for the sequence-to-sequence structure, each feature map is divided into different instances over
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3c8ad4d49d82fdfb5664597c43fba3f6