Showing 1 - 10 of 30 for search: '"Bansal, Ankan"'
Author:
Liao, Haofu, RoyChowdhury, Aruni, Li, Weijian, Bansal, Ankan, Zhang, Yuting, Tu, Zhuowen, Satzoda, Ravi Kumar, Manmatha, R., Mahadevan, Vijay
We present a new formulation for structured information extraction (SIE) from visually rich documents. It aims to address the limitations of existing IOB tagging or graph-based formulations, which are either overly reliant on the correct ordering of
External link:
http://arxiv.org/abs/2307.07929
Author:
Mishra, Shlok, Shah, Anshul, Bansal, Ankan, Jagannatha, Abhyuday, Anjaria, Janit, Sharma, Abhishek, Jacobs, David, Krishnan, Dilip
Published in:
Transactions on Machine Learning Research 2022
A core component of the recent success of self-supervised learning is cropping data augmentation, which selects sub-regions of an image to be used as positive views in the self-supervised loss. The underlying assumption is that randomly cropped and r
External link:
http://arxiv.org/abs/2112.00319
Author:
Mishra, Shlok, Shah, Anshul, Bansal, Ankan, Anjaria, Janit, Choi, Jonghyun, Shrivastava, Abhinav, Sharma, Abhishek, Jacobs, David
Published in:
BMVC 2022
Recent literature has shown that features obtained from supervised training of CNNs may over-emphasize texture rather than encoding high-level information. In self-supervised learning in particular, texture as a low-level cue may provide shortcuts th
External link:
http://arxiv.org/abs/2011.01901
Author:
Shah, Anshul, Mishra, Shlok, Bansal, Ankan, Chen, Jun-Cheng, Chellappa, Rama, Shrivastava, Abhinav
Recent progress on action recognition has mainly focused on RGB and optical flow features. In this paper, we approach the problem of joint-based action recognition. Unlike other modalities, constellation of joints and their motion generate models wit
External link:
http://arxiv.org/abs/2010.08164
We introduce the task of Image-Set Visual Question Answering (ISVQA), which generalizes the commonly studied single-image VQA problem to multi-image settings. Taking a natural language question and a set of images as input, it aims to answer the ques
External link:
http://arxiv.org/abs/2008.11976
The relative spatial layout of a human and an object is an important cue for determining how they interact. However, until now, spatial layout has been used just as side-information for detecting human-object interactions (HOIs). In this paper, we pr
External link:
http://arxiv.org/abs/2004.04851
Author:
Dhar, Prithviraj, Bansal, Ankan, Castillo, Carlos D., Gleason, Joshua, Phillips, P. Jonathon, Chellappa, Rama
As deep networks become increasingly accurate at recognizing faces, it is vital to understand how these networks process faces. While these networks are solely trained to recognize identities, they also contain face related information such as sex, a
External link:
http://arxiv.org/abs/1910.05657
We present an approach for detecting human-object interactions (HOIs) in images, based on the idea that humans interact with functionally similar objects in a similar manner. The proposed model is simple and efficiently uses the data, visual features
External link:
http://arxiv.org/abs/1904.03181
Author:
Ranjan, Rajeev, Bansal, Ankan, Zheng, Jingxiao, Xu, Hongyu, Gleason, Joshua, Lu, Boyu, Nanduri, Anirudh, Chen, Jun-Cheng, Castillo, Carlos D., Chellappa, Rama
The availability of large annotated datasets and affordable computation power have led to impressive improvements in the performance of CNNs on various object detection and recognition benchmarks. These, along with a better understanding of deep lear
External link:
http://arxiv.org/abs/1809.07586
We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-gr
External link:
http://arxiv.org/abs/1804.04340