Showing 1 - 10 of 1,934 for search: '"ZHAI, WEI"'
Variational Autoencoder (VAE) aims to compress pixel data into a low-dimensional latent space, playing an important role in OpenAI's Sora and other latent video diffusion generation models. While most existing video VAEs inflate a pretrained image V
External link:
http://arxiv.org/abs/2411.06449
Scene reconstruction from casually captured videos has wide applications in real-world scenarios. With recent advancements in differentiable rendering techniques, several methods have attempted to simultaneously optimize scene representations (NeRF o
External link:
http://arxiv.org/abs/2410.15392
Perceiving potential "action possibilities" (i.e., affordance) regions of images and learning interactive functionalities of objects from human demonstration is a challenging task due to the diversity of human-object interactions. Prevailing afforda
External link:
http://arxiv.org/abs/2410.11363
Authors:
Zhai, Wei, Bai, Nan, Zhao, Qing, Li, Jianqiang, Wang, Fan, Qi, Hongzhi, Jiang, Meng, Wang, Xiaoqin, Yang, Bing Xiang, Fu, Guanghui
With the growing prevalence of mental health challenges, social media has emerged as a key platform for individuals to express their emotions. Deep learning tends to be a promising solution for analyzing mental health on social media. However, black box models
External link:
http://arxiv.org/abs/2410.10323
Recent advancements in multi-modal large language models have propelled the development of joint probabilistic models capable of both image understanding and generation. However, we have identified that recent methods inevitably suffer from loss of i
External link:
http://arxiv.org/abs/2410.10798
Zero-shot anomaly detection (ZSAD) recognizes and localizes anomalies in previously unseen objects by establishing feature mapping between textual prompts and inspection images, demonstrating excellent research value in flexible industrial manufactur
External link:
http://arxiv.org/abs/2409.20146
Grounding 3D scene affordance aims to locate interactive regions in 3D environments, which is crucial for embodied agents to interact intelligently with their surroundings. Most existing approaches achieve this by mapping semantics to 3D instances ba
External link:
http://arxiv.org/abs/2409.19650
First-person hand-object interaction anticipation aims to predict the interaction process over a forthcoming period based on current scenes and prompts. This capability is crucial for embodied intelligence and human-robot collaboration. The complete
External link:
http://arxiv.org/abs/2407.21510
Authors:
Wang, Hongyi, Sun, Ji, Liang, Jinzhe, Zhai, Li, Tang, Zitian, Li, Zijian, Zhai, Wei, Wang, Xusheng, Gao, Weihao, Gong, Sheng
The ionic bonding across the lattice and ordered microscopic structures endow crystals with unique symmetry and determine their macroscopic properties. Unconventional crystals, in particular, exhibit non-traditional lattice structures or possess exot
External link:
http://arxiv.org/abs/2407.16131
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications like AR/VR and embodied AI. For egocentric HOI, in addition to perceiving semantics, e.g., "what" interaction
External link:
http://arxiv.org/abs/2405.13659