Showing 1 - 10 of 35,732 results for search: '"Yujun An"'
In recent years, as the population ages, falls have increasingly posed a significant threat to the health of the elderly. We propose a real-time fall detection system that integrates the inertial measurement unit (IMU) of a smartphone with optimized
External link:
http://arxiv.org/abs/2412.09980
The potential of Wi-Fi backscatter communications systems is immense, yet challenges such as signal instability and energy constraints impose performance limits. This paper introduces FlexScatter, a Wi-Fi backscatter system using a designed schedulin
External link:
http://arxiv.org/abs/2412.08982
To address the challenges of robust data transmission over complex time-varying channels, this paper introduces channel learning and enhanced adaptive reconstruction (CLEAR) strategy for semantic communications. CLEAR integrates deep joint source-cha
External link:
http://arxiv.org/abs/2412.08978
Authors:
Lu, Fan; Wu, Wei; Zheng, Kecheng; Ma, Shuailei; Gong, Biao; Liu, Jiawei; Zhai, Wei; Cao, Yang; Shen, Yujun; Zha, Zheng-Jun
Generating detailed captions comprehending text-rich visual content in images has received growing attention for Large Vision-Language Models (LVLMs). However, few studies have developed benchmarks specifically tailored for detailed captions to measu
External link:
http://arxiv.org/abs/2412.08614
Authors:
Ma, Shuailei; Zheng, Kecheng; Wei, Ying; Wu, Wei; Lu, Fan; Zhang, Yifei; Xie, Chen-Wei; Gong, Biao; Zhu, Jiapeng; Shen, Yujun
Although text-to-image (T2I) models have recently thrived as visual generative priors, their reliance on high-quality text-image pairs makes scaling up expensive. We argue that grasping the cross-modality alignment is not a necessity for a sound visu
External link:
http://arxiv.org/abs/2412.07767
This paper presents PlanarSplatting, an ultra-fast and accurate surface reconstruction approach for multiview indoor images. We take the 3D planes as the main objective due to their compactness and structural expressiveness in indoor scenes, and deve
External link:
http://arxiv.org/abs/2412.03451
Authors:
Tan, Shuai; Gong, Biao; Feng, Yutong; Zheng, Kecheng; Zheng, Dandan; Shi, Shuwei; Shen, Yujun; Chen, Jingdong; Yang, Ming
Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. The r
External link:
http://arxiv.org/abs/2412.03085
Authors:
Salter, Sasha; Warren, Richard; Schlager, Collin; Spurr, Adrian; Han, Shangchen; Bhasin, Rohin; Cai, Yujun; Walkington, Peter; Bolarinwa, Anuoluwapo; Wang, Robert; Danielson, Nathan; Merel, Josh; Pnevmatikakis, Eftychios; Marshall, Jesse
Hands are the primary means through which humans interact with the world. Reliable and always-available hand pose inference could yield new and intuitive control schemes for human-computer interactions, particularly in virtual and augmented reality.
External link:
http://arxiv.org/abs/2412.02725
Authors:
Li, Zehao; Han, Wenwei; Cai, Yujun; Jiang, Hao; Bi, Baolong; Gao, Shuqin; Zhao, Honglong; Wang, Zhaoqi
While 3D Gaussian Splatting enables high-quality real-time rendering, existing Gaussian-based frameworks for 3D semantic segmentation still face significant challenges in boundary recognition accuracy. To address this, we propose a novel 3DGS-based f
External link:
http://arxiv.org/abs/2412.00392
Despite inheriting security measures from underlying language models, Vision-Language Models (VLMs) may still be vulnerable to safety alignment issues. Through empirical analysis, we uncover two critical findings: scenario-matched images can signific
External link:
http://arxiv.org/abs/2411.18000