Showing 1 - 10 of 56 for the search: '"Feng, Chengjian"'
Automated Machine Learning (AutoML) offers a promising approach to streamline the training of machine learning models. However, existing AutoML frameworks are often limited to unimodal scenarios and require extensive manual configuration. Recent adva…
External link:
http://arxiv.org/abs/2408.00665
Author:
Wang, Hao, Ren, Pengzhen, Jie, Zequn, Dong, Xiao, Feng, Chengjian, Qian, Yinlong, Ma, Lin, Jiang, Dongmei, Wang, Yaowei, Lan, Xiangyuan, Liang, Xiaodan
Open-vocabulary detection is a challenging task due to the requirement of detecting objects based on class names, including those not encountered during training. Existing methods have shown strong zero-shot detection capabilities through pre-trainin…
External link:
http://arxiv.org/abs/2407.07844
Foundation models hold significant potential for enabling robots to perform long-horizon general manipulation tasks. However, the simplicity of tasks and the uniformity of environments in existing benchmarks restrict their effective deployment in com…
External link:
http://arxiv.org/abs/2407.06951
Utilizing Vision-Language Models (VLMs) for robotic manipulation represents a novel paradigm, aiming to enhance the model's ability to generalize to new objects and instructions. However, due to variations in camera specifications and mounting positi…
External link:
http://arxiv.org/abs/2406.18977
Temporal Action Detection (TAD) focuses on detecting pre-defined actions, while Moment Retrieval (MR) aims to identify events described by open-ended natural language within untrimmed videos. Although they focus on different events, we observ…
External link:
http://arxiv.org/abs/2404.04933
In this paper, we present a novel paradigm for enhancing the ability of an object detector, e.g., expanding its categories or improving its detection performance, by training on a synthetic dataset generated from diffusion models. Specifically, we integrate an insta…
External link:
http://arxiv.org/abs/2402.05937
Author:
Zhou, Sifan, Tian, Zhi, Chu, Xiangxiang, Zhang, Xinyu, Zhang, Bo, Lu, Xiaobo, Feng, Chengjian, Jie, Zequn, Chiang, Patrick Yin, Ma, Lin
The deployment of 3D detectors poses one of the major challenges in real-world self-driving scenarios. Existing BEV-based (i.e., Bird's-Eye-View) detectors favor sparse convolutions (known as SPConv) to speed up training and inference, which puts a h…
External link:
http://arxiv.org/abs/2302.02367
Recent LSS-based multi-view 3D object detection has made tremendous progress by processing the features in Bird's-Eye-View (BEV) via a convolutional detector. However, typical convolution ignores the radial symmetry of the BEV features and incre…
External link:
http://arxiv.org/abs/2211.12501
Author:
Feng, Chengjian, Zhong, Yujie, Jie, Zequn, Chu, Xiangxiang, Ren, Haibing, Wei, Xiaolin, Xie, Weidi, Ma, Lin
The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations. To achieve that, we make the following four contributions: (i) in pursuit of generalisation, we…
External link:
http://arxiv.org/abs/2203.16513
One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two…
External link:
http://arxiv.org/abs/2108.07755