Showing 1 - 10 of 16 for search: '"Ghose, Shuvozit"'
Point cloud understanding is an inherently challenging problem because of the sparse and unordered structure of the point cloud in the 3D space. Recently, Contrastive Vision-Language Pre-training (CLIP) based point cloud classification model i.e. Poi
External link:
http://arxiv.org/abs/2408.03545
Authors:
Ghose, Shuvozit, Wang, Yang
Point cloud classification refers to the process of assigning semantic labels or categories to individual points within a point cloud data structure. Recent works have explored the extension of pre-trained CLIP to 3D recognition. In this direction, C
External link:
http://arxiv.org/abs/2404.00857
Authors:
Bhunia, Ayan Kumar, Sain, Aneeshan, Kumar, Amandeep, Ghose, Shuvozit, Chowdhury, Pinaki Nath, Song, Yi-Zhe
Although text recognition has significantly evolved over the years, state-of-the-art (SOTA) models still struggle in the wild scenarios due to complex backgrounds, varying fonts, uncontrolled illuminations, distortions and other artefacts. This is be
External link:
http://arxiv.org/abs/2107.12090
Authors:
Maruyama, Mizuki, Ghose, Shuvozit, Inoue, Katsufumi, Roy, Partha Pratim, Iwamura, Masakazu, Yoshioka, Michifumi
In recent years, Word-level Sign Language Recognition (WSLR) research has gained popularity in the computer vision community, and thus various approaches have been proposed. Among these approaches, the method using I3D network achieves the highest re
External link:
http://arxiv.org/abs/2106.15989
Authors:
Bhunia, Ayan Kumar, Ghose, Shuvozit, Kumar, Amandeep, Chowdhury, Pinaki Nath, Sain, Aneeshan, Song, Yi-Zhe
Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. Prior works however generally operate with the assumption that there is a limited number of styles, most of whi
External link:
http://arxiv.org/abs/2104.01876
Degraded document image binarization is one of the most challenging tasks in the domain of document image analysis. In this paper, we present a novel approach towards document image binarization by introducing a three-player min-max adversarial game. W
External link:
http://arxiv.org/abs/2007.07075
Ground Terrain Recognition is a difficult task as the context information varies significantly over the regions of a ground terrain image. In this paper, we propose a novel approach towards ground-terrain recognition via modeling the Extent-of-Textur
External link:
http://arxiv.org/abs/2004.08141
Authors:
Bhunia, Ayan Kumar, Bhunia, Ankan Kumar, Ghose, Shuvozit, Das, Abhirup, Roy, Partha Pratim, Pal, Umapada
Logo detection in real-world scene images is an important problem with applications in advertisement and marketing. Existing general-purpose object detection methods require large training data with annotations for every logo class. These methods do
External link:
http://arxiv.org/abs/1811.01395
Thumbnails are widely used all over the world as a preview for digital images. In this work we propose a deep neural framework to generate thumbnails of any size and aspect ratio, even for unseen values during training, with high accuracy and precisi
External link:
http://arxiv.org/abs/1810.13054
In this paper, a new texture descriptor named "Fractional Local Neighborhood Intensity Pattern" (FLNIP) has been proposed for content based image retrieval (CBIR). It is an extension of the Local Neighborhood Intensity Pattern (LNIP)[1]. FLNIP calcul
External link:
http://arxiv.org/abs/1801.00187