Showing 1 - 10 of 15,336 for search: '"Yang,Yue"'
Author:
Zhou, Pengfei, Peng, Xiaopeng, Song, Jiajun, Li, Chuanhao, Xu, Zhaopan, Yang, Yue, Guo, Ziyao, Zhang, Hao, Lin, Yuqi, He, Yefei, Zhao, Lirui, Liu, Shuo, Li, Tianhua, Xie, Yuxuan, Chang, Xiaojun, Qiao, Yu, Shao, Wenqi, Zhang, Kaipeng
Multimodal Large Language Models (MLLMs) have made significant strides in visual understanding and generation tasks. However, generating interleaved image-text content remains a challenge, as it requires integrated multimodal understanding and generation…
External link:
http://arxiv.org/abs/2411.18499
Electroencephalography (EEG) is essential in neuroscience and clinical practice, yet it suffers from physiological artifacts, particularly electromyographic (EMG) contamination, which distorts the signal. We propose a deep learning model using pix2pixGAN to remove such artifacts…
External link:
http://arxiv.org/abs/2411.13288
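The snippet above names pix2pixGAN, i.e. a conditional GAN trained on paired noisy/clean examples. As a rough illustration only (not the authors' released code; the channel count, window length, and simplified generator stack are all assumptions), a pix2pix-style setup for EEG denoising might look like this in PyTorch:

```python
# Hypothetical sketch of a pix2pix-style EEG artifact remover.
# Shapes and architecture are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Toy 1-D conv stack standing in for pix2pix's usual U-Net generator."""
    def __init__(self, channels: int = 19):  # 19 EEG channels: an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 64, kernel_size=15, padding=7),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=15, padding=7),
            nn.ReLU(),
            nn.Conv1d(64, channels, kernel_size=15, padding=7),
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """PatchGAN-style critic over concatenated (noisy, candidate-clean) signals."""
    def __init__(self, channels: int = 19):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2 * channels, 64, kernel_size=15, stride=2, padding=7),
            nn.LeakyReLU(0.2),
            nn.Conv1d(64, 1, kernel_size=15, stride=2, padding=7),
        )

    def forward(self, noisy, candidate):
        return self.net(torch.cat([noisy, candidate], dim=1))

# One illustrative generator step: adversarial loss plus L1 reconstruction,
# the standard pix2pix objective (the 100x weight is pix2pix's default).
G, D = Generator(), Discriminator()
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
noisy = torch.randn(8, 19, 512)   # batch x channels x samples (assumed shapes)
clean = torch.randn(8, 19, 512)   # paired artifact-free target
fake = G(noisy)
d_fake = D(noisy, fake)
g_loss = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, clean)
g_loss.backward()
```

The L1 term keeps the cleaned signal close to the paired target while the adversarial term pushes it toward realistic EEG morphology; the paper's actual architecture and loss weighting may differ.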
Robot Imitation Learning (IL) is a crucial technique in robot learning, where agents learn by mimicking human demonstrations. However, IL encounters scalability challenges stemming from both non-user-friendly demonstration collection methods and the…
External link:
http://arxiv.org/abs/2410.15994
We simulated the intermittent boundary-layer flashback (BLF) of hydrogen-enriched swirling flames using large-eddy simulation (LES) with the flame-surface-density (FSD) method. Three cases of intermittent BLF, characterized by periodic flame entry and…
External link:
http://arxiv.org/abs/2410.15988
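For context on the flame-surface-density method this snippet mentions: in premixed LES, the filtered reaction rate is commonly closed as the product of the unburnt-gas density, the laminar flame speed, and a generalized flame surface density. A standard algebraic form (an assumption about which closure this paper uses) is:

```latex
% Generalized FSD closure for the filtered progress-variable source term
% in premixed LES (standard form; the paper's exact model may differ):
%   \rho_u : unburnt-gas density,  s_L : laminar flame speed,
%   c : progress variable,  \Sigma_{gen} : generalized flame surface density.
\[
  \overline{\dot{\omega}}_c = \rho_u \, s_L \, \Sigma_{\mathrm{gen}},
  \qquad
  \Sigma_{\mathrm{gen}} = \overline{\lvert \nabla c \rvert}
\]
```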
Inflammatory or misleading "fake" news content has proliferated in recent years. Simultaneously, it has become easier than ever to use AI tools to generate photorealistic images depicting any scene imaginable. Combi…
External link:
http://arxiv.org/abs/2410.09045
Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across multimodal tasks such as visual perception and reasoning, leading to good performance on various multimodal evaluation benchmarks. However, these benchmarks keep a…
External link:
http://arxiv.org/abs/2410.08695
Author:
Le, Long, Xie, Jason, Liang, William, Wang, Hung-Ju, Yang, Yue, Ma, Yecheng Jason, Vedder, Kyle, Krishna, Arjun, Jayaraman, Dinesh, Eaton, Eric
Interactive 3D simulated objects are crucial in AR/VR, animations, and robotics, driving immersive experiences and advanced automation. However, creating these articulated objects requires extensive human effort and expertise, limiting their broader…
External link:
http://arxiv.org/abs/2410.13882
Author:
Wang, Zhaowei, Zhang, Hongming, Fang, Tianqing, Tian, Ye, Yang, Yue, Ma, Kaixin, Pan, Xiaoman, Song, Yangqiu, Yu, Dong
Object navigation in unknown environments is crucial for deploying embodied agents in real-world applications. While we have witnessed huge progress due to large-scale scene datasets, faster simulators, and stronger models, previous studies mainly focus…
External link:
http://arxiv.org/abs/2410.02730
Author:
Zhang, Han, Killeen, Benjamin D., Ku, Yu-Chun, Seenivasan, Lalithkumar, Zhao, Yuxuan, Liu, Mingxu, Yang, Yue, Gu, Suxi, Martin-Gomez, Alejandro, Taylor, Russell H., Osgood, Greg, Unberath, Mathias
In percutaneous pelvic trauma surgery, accurate placement of Kirschner wires (K-wires) is crucial to ensure effective fracture fixation and avoid complications due to breaching the cortical bone along an unsuitable trajectory. Surgical navigation via…
External link:
http://arxiv.org/abs/2410.01143
Author:
Deitke, Matt, Clark, Christopher, Lee, Sangho, Tripathi, Rohun, Yang, Yue, Park, Jae Sung, Salehi, Mohammadreza, Muennighoff, Niklas, Lo, Kyle, Soldaini, Luca, Lu, Jiasen, Anderson, Taira, Bransom, Erin, Ehsani, Kiana, Ngo, Huong, Chen, YenSung, Patel, Ajay, Yatskar, Mark, Callison-Burch, Chris, Head, Andrew, Hendrix, Rose, Bastani, Favyen, VanderBilt, Eli, Lambert, Nathan, Chou, Yvonne, Chheda, Arnavi, Sparks, Jenna, Skjonsberg, Sam, Schmitz, Michael, Sarnat, Aaron, Bischoff, Byron, Walsh, Pete, Newell, Chris, Wolters, Piper, Gupta, Tanmay, Zeng, Kuo-Hao, Borchardt, Jon, Groeneveld, Dirk, Nam, Crystal, Lebrecht, Sophie, Wittlif, Caitlin, Schoenick, Carissa, Michel, Oscar, Krishna, Ranjay, Weihs, Luca, Smith, Noah A., Hajishirzi, Hannaneh, Girshick, Ross, Farhadi, Ali, Kembhavi, Aniruddha
Today's most advanced vision-language models (VLMs) remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed VLMs into open ones. As a result…
External link:
http://arxiv.org/abs/2409.17146