Showing 1 - 10 of 1,320 for search: '"Liu, Zhijian"'
Author:
Liu, Zhijian, Zhang, Zhuoyang, Khaki, Samir, Yang, Shang, Tang, Haotian, Xu, Chenfeng, Keutzer, Kurt, Han, Song
Semantic segmentation empowers numerous real-world applications, such as autonomous driving and augmented/mixed reality. These applications often operate on high-resolution images (e.g., 8 megapixels) to capture the fine details. However, this comes
External link:
http://arxiv.org/abs/2407.19014
We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos. LidarDM stands out with two unprecedented capabilities in LiDAR generative modeling: (i) LiDA
External link:
http://arxiv.org/abs/2404.02903
Author:
Kodaira, Akio, Xu, Chenfeng, Hazama, Toshiki, Yoshimoto, Takanori, Ohno, Kohei, Mitsuhori, Shogo, Sugano, Soichi, Cho, Hanying, Liu, Zhijian, Keutzer, Kurt
We introduce StreamDiffusion, a real-time diffusion pipeline designed for interactive image generation. Existing diffusion models are adept at creating images from text or image prompts, yet they often fall short in real-time interaction. This limita
External link:
http://arxiv.org/abs/2312.12491
Author:
Wu, Xiaoyang, Jiang, Li, Wang, Peng-Shuai, Liu, Zhijian, Liu, Xihui, Qiao, Yu, Ouyang, Wanli, He, Tong, Zhao, Hengshuang
This paper is not motivated to seek innovation within the attention mechanism. Instead, it focuses on overcoming the existing trade-offs between accuracy and efficiency within the context of point cloud processing, leveraging the power of scale. Draw
External link:
http://arxiv.org/abs/2312.10035
Author:
Tang, Haotian, Yang, Shang, Liu, Zhijian, Hong, Ke, Yu, Zhongming, Li, Xiuyu, Dai, Guohao, Wang, Yu, Han, Song
Sparse convolution plays a pivotal role in emerging workloads, including point cloud processing in AR/VR, autonomous driving, and graph understanding in recommendation systems. Since the computation pattern is sparse and irregular, specialized high-p
External link:
http://arxiv.org/abs/2311.12862
We present LongLoRA, an efficient fine-tuning approach that extends the context sizes of pre-trained large language models (LLMs), with limited computation cost. Typically, training LLMs with long context sizes is computationally expensive, requiring
External link:
http://arxiv.org/abs/2309.12307
Despite tremendous advancements in bird's-eye view (BEV) perception, existing models fall short in generating realistic and coherent semantic map layouts, and they fail to account for uncertainties arising from partial sensor information (such as occ
External link:
http://arxiv.org/abs/2308.12963
Published in:
Signal, Image and Video Processing, 2023
Automatic hardhat wearing detection can strengthen the safety management in construction sites, which is still challenging due to complicated video surveillance scenes. To deal with the poor generalization of previous deep learning based methods, a n
External link:
http://arxiv.org/abs/2307.04103
High-resolution images enable neural networks to learn richer visual representations. However, this improved performance comes at the cost of growing computational complexity, hindering their usage in latency-sensitive applications. As not all pixels
External link:
http://arxiv.org/abs/2303.17605
Transformer, as an alternative to CNN, has been proven effective in many modalities (e.g., texts and images). For 3D point cloud transformers, existing efforts focus primarily on pushing their accuracy to the state-of-the-art level. However, their la
External link:
http://arxiv.org/abs/2301.08739