Showing 1 - 10 of 826 results for search: '"Yang, ShiQi"'
Author:
Lu, Chenhao, Cheng, Xuxin, Li, Jialong, Yang, Shiqi, Ji, Mazeyu, Yuan, Chengjing, Yang, Ge, Yi, Sha, Wang, Xiaolong
Humanoid robots require both robust lower-body locomotion and precise upper-body manipulation. While recent Reinforcement Learning (RL) approaches provide whole-body loco-manipulation policies, they lack precise manipulation with high-DoF arms. In th
External link:
http://arxiv.org/abs/2412.07773
Author:
Zhao, Mengjie, Zhong, Zhi, Mao, Zhuoyuan, Yang, Shiqi, Liao, Wei-Hsiang, Takahashi, Shusuke, Wakaki, Hiromi, Mitsufuji, Yuki
We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. To construct OpenMU-Bench, we leveraged existing datasets and bootstrapped new annotations. Open
External link:
http://arxiv.org/abs/2410.15573
Author:
Mirza, M. Jehanzeb, Zhao, Mengjie, Mao, Zhuoyuan, Doveh, Sivan, Lin, Wei, Gavrikov, Paul, Dorkenwald, Michael, Yang, Shiqi, Jha, Saurav, Wakaki, Hiromi, Mitsufuji, Yuki, Possegger, Horst, Feris, Rogerio, Karlinsky, Leonid, Glass, James
In this work, we propose a novel method (GLOV) enabling Large Language Models (LLMs) to act as implicit Optimizers for Vision-Language Models (VLMs) to enhance downstream vision tasks. Our GLOV meta-prompts an LLM with the downstream task descriptio
External link:
http://arxiv.org/abs/2410.06154
Author:
Jha, Saurav, Yang, Shiqi, Ishii, Masato, Zhao, Mengjie, Simon, Christian, Mirza, Muhammad Jehanzeb, Gong, Dong, Yao, Lina, Takahashi, Shusuke, Mitsufuji, Yuki
Personalized text-to-image diffusion models have grown popular for their ability to efficiently acquire a new concept from user-defined text descriptions and a few images. However, in the real world, a user may wish to personalize a model on multiple
External link:
http://arxiv.org/abs/2410.00700
Author:
Yang, Shiqi, Liu, Minghuan, Qin, Yuzhe, Ding, Runyu, Li, Jialong, Cheng, Xuxin, Yang, Ruihan, Yi, Sha, Wang, Xiaolong
Learning from demonstrations has been shown to be an effective approach to robotic manipulation, especially with the recently collected large-scale robot data from teleoperation systems. Building an efficient teleoperation system across diverse robot plat
External link:
http://arxiv.org/abs/2408.11805
Author:
Ding, Runyu, Qin, Yuzhe, Zhu, Jiyue, Jia, Chengzhe, Yang, Shiqi, Yang, Ruihan, Qi, Xiaojuan, Wang, Xiaolong
Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate ma
External link:
http://arxiv.org/abs/2407.03162
Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations. The intuitiveness and ease of use of the teleoperation system are crucial for ensuring high-quality, diverse, and scalable data.
External link:
http://arxiv.org/abs/2407.01512
Author:
Comunità, Marco, Zhong, Zhi, Takahashi, Akira, Yang, Shiqi, Zhao, Mengjie, Saito, Koichi, Ikemiya, Yukara, Shibuya, Takashi, Takahashi, Shusuke, Mitsufuji, Yuki
Recent advances in generative models that iteratively synthesize audio clips have sparked great success in text-to-audio synthesis (TTA), but at the cost of slow synthesis speed and heavy computation. Although there have been attempts to accelerate the
External link:
http://arxiv.org/abs/2406.17672
Author:
Huang, Xinyue, Song, Zhigang, Gao, Yuchen, Gu, Pingfan, Watanabe, Kenji, Taniguchi, Takashi, Yang, Shiqi, Chen, Zuxin, Ye, Yu
We present a comprehensive investigation of optical properties in MoSe$_2$/CrSBr heterostructures, unveiling the presence of localized excitons represented by a new emission feature, X$^*$. We demonstrate through temperature- and power-dependent phot
External link:
http://arxiv.org/abs/2405.16079
Author:
Yang, Shiqi, Zhong, Zhi, Zhao, Mengjie, Takahashi, Shusuke, Ishii, Masato, Shibuya, Takashi, Mitsufuji, Yuki
In recent years, with their realistic generation results and a wide range of personalized applications, diffusion-based generative models have gained huge attention in both the visual and audio generation areas. Compared to the considerable advancements of text2i
External link:
http://arxiv.org/abs/2405.14598