Showing 1 - 10
of 48
for the search: '"Zhang, Zaiwei"'
Author:
Zhang, Zaiwei, Meyer, Gregory P., Lu, Zhichao, Shrivastava, Ashish, Ravichandran, Avinash, Wolff, Eric M.
For visual recognition, knowledge distillation typically involves transferring knowledge from a large, well-trained teacher model to a smaller student model. In this paper, we introduce an effective method to distill knowledge from an off-the-shelf v…
External link:
http://arxiv.org/abs/2408.16930
Compositionality is a common property in many modalities including natural languages and images, but the compositional generalization of multi-modal models is not well-understood. In this paper, we identify two sources of visual-linguistic compositionality…
External link:
http://arxiv.org/abs/2310.02777
Author:
Yang, Haitao, Zhang, Zaiwei, Huang, Xiangru, Bai, Min, Song, Chen, Sun, Bo, Li, Li Erran, Huang, Qixing
Bird's-Eye View (BEV) features are popular intermediate scene representations shared by the 3D backbone and the detector head in LiDAR-based object detectors. However, little research has been done to investigate how to incorporate additional supervision…
External link:
http://arxiv.org/abs/2304.01519
Reconstructing 3D objects is an important computer vision task that has wide application in AR/VR. Deep learning algorithms developed for this task usually rely on unrealistic synthetic datasets, such as ShapeNet and Things3D. On the other hand, e…
External link:
http://arxiv.org/abs/2206.12356
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision. State-of-the-art approaches typically assume accurate camera poses as input, which could be difficult to obtain in realistic se…
External link:
http://arxiv.org/abs/2205.07763
Author:
Yang, Haitao, Zhang, Zaiwei, Yan, Siming, Huang, Haibin, Ma, Chongyang, Zheng, Yi, Bajaj, Chandrajit, Huang, Qixing
Developing deep neural networks to generate 3D scenes is a fundamental problem in neural synthesis with immediate applications in architectural CAD, computer graphics, as well as in generating virtual robot training environments. This task is challenging…
External link:
http://arxiv.org/abs/2108.13499
This paper introduces an unsupervised loss for training parametric deformation shape generators. The key idea is to enforce the preservation of local rigidity among the generated shapes. Our approach builds on an approximation of the as-rigid-as-possible…
External link:
http://arxiv.org/abs/2108.09432
Pretraining on large labeled datasets is a prerequisite to achieve good performance in many computer vision tasks like 2D object recognition, video classification, etc. However, pretraining is not widely used for 3D recognition tasks where state-of-the-art…
External link:
http://arxiv.org/abs/2101.02691
We introduce H3DNet, which takes a colorless 3D point cloud as input and outputs a collection of oriented object bounding boxes (or BB) and their semantic labels. The critical idea of H3DNet is to predict a hybrid set of geometric primitives, i.e., B…
External link:
http://arxiv.org/abs/2006.05682
In this paper, we introduce the problem of jointly learning feed-forward neural networks across a set of relevant but diverse datasets. Compared to learning a separate network from each dataset in isolation, joint learning enables us to extract corre…
External link:
http://arxiv.org/abs/1905.06526