Showing 1 - 10 of 627 for search: '"Lin, Weihong"'
In this paper, we present a new question-answering (QA) based key-value pair extraction approach, called KVPFormer, to robustly extract key-value relationships between entities from form-like document images. Specifically, KVPFormer first identifi
External link:
http://arxiv.org/abs/2304.07957
We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation lin
External link:
http://arxiv.org/abs/2303.11615
Authors:
Liang, Weicong; Yuan, Yuhui; Ding, Henghui; Luo, Xiao; Lin, Weihong; Jia, Ding; Zhang, Zheng; Zhang, Chao; Hu, Han
Vision transformers have recently achieved competitive results across various vision tasks but still suffer from heavy computation costs when processing a large number of tokens. Many advanced approaches have been developed to reduce the total number
External link:
http://arxiv.org/abs/2210.01035
We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation lin
External link:
http://arxiv.org/abs/2208.04921
Authors:
Jia, Ding; Yuan, Yuhui; He, Haodi; Wu, Xiaopei; Yu, Haojun; Lin, Weihong; Sun, Lei; Zhang, Chao; Hu, Han
One-to-one set matching is a key design for DETR to establish its end-to-end capability, so that object detection does not require a hand-crafted NMS (non-maximum suppression) to remove duplicate detections. This end-to-end signature is important for
External link:
http://arxiv.org/abs/2207.13080
We introduce a new table detection and structure recognition approach named RobusTabNet to detect the boundaries of tables and reconstruct the cellular structure of each table from heterogeneous document images. For table detection, we propose to use
External link:
http://arxiv.org/abs/2203.09056
We present a High-Resolution Transformer (HRFormer) that learns high-resolution representations for dense prediction tasks, in contrast to the original Vision Transformer that produces low-resolution representations and has high memory and computatio
External link:
http://arxiv.org/abs/2110.09408
Recent grid-based document representations like BERTgrid allow the simultaneous encoding of the textual and layout information of a document in a 2D feature map so that state-of-the-art image segmentation and/or object detection models can be straigh
External link:
http://arxiv.org/abs/2105.11672