Showing 1 - 10 of 825 for search: '"LI, Haoyuan"'
Unmanned Aerial Vehicle (UAV) Cross-View Geo-Localization (CVGL) presents significant challenges due to the view discrepancy between oblique UAV images and overhead satellite images. Existing methods heavily rely on the supervision of labeled dataset
External link:
http://arxiv.org/abs/2411.14816
Authors:
Huang, Hongzhe, Yu, Zhewen, Liu, Jiang, Cai, Li, Jiao, Dian, Zhang, Wenqiao, Tang, Siliang, Li, Juncheng, Jiang, Hao, Li, Haoyuan, Zhuang, Yueting
Recent advances in Multi-modal Large Language Models (MLLMs), such as LLaVA-series models, are driven by massive machine-generated instruction-following data tuning. Such automatic instruction collection pipelines, however, inadvertently introduce si
External link:
http://arxiv.org/abs/2409.18541
Authors:
Chen, Kai, Gou, Yunhao, Huang, Runhui, Liu, Zhili, Tan, Daxin, Xu, Jing, Wang, Chunwei, Zhu, Yi, Zeng, Yihan, Yang, Kuo, Wang, Dingdong, Xiang, Kun, Li, Haoyuan, Bai, Haoli, Han, Jianhua, Li, Xiaohui, Jin, Weike, Xie, Nian, Zhang, Yu, Kwok, James T., Zhao, Hengshuang, Liang, Xiaodan, Yeung, Dit-Yan, Chen, Xiao, Li, Zhenguo, Zhang, Wei, Liu, Qun, Yao, Jun, Hong, Lanqing, Hou, Lu, Xu, Hang
GPT-4o, an omni-modal model that enables vocal conversations with diverse emotions and tones, marks a milestone for omni-modal foundation models. However, empowering Large Language Models to perceive and generate images, texts, and speeches end-to-en
External link:
http://arxiv.org/abs/2409.18042
Federated learning (FL) has emerged as a prominent method for collaboratively training machine learning models using local data from edge devices, all while keeping data decentralized. However, accounting for the quality of data contributed by local
External link:
http://arxiv.org/abs/2409.02189
Authors:
Shu, Fangxun, Liao, Yue, Zhuo, Le, Xu, Chenning, Zhang, Lei, Zhang, Guanghao, Shi, Haonan, Chen, Long, Zhong, Tao, He, Wanggui, Fu, Siming, Li, Haoyuan, Li, Bolin, Yu, Zhelun, Liu, Si, Li, Hongsheng, Jiang, Hao
We introduce LLaVA-MoD, a novel framework designed to enable the efficient training of small-scale Multimodal Language Models (s-MLLM) by distilling knowledge from large-scale MLLM (l-MLLM). Our approach tackles two fundamental challenges in MLLM dis
External link:
http://arxiv.org/abs/2408.15881
More accurate capacitance extraction is demanded for designing integrated circuits under advanced process technology. The pattern matching approach and the field solver for capacitance extraction have the drawbacks of inaccuracy and large computation
External link:
http://arxiv.org/abs/2408.13195
Authors:
Lin, Tianwei, Liu, Jiang, Zhang, Wenqiao, Li, Zhaocheng, Dai, Yang, Li, Haoyuan, Yu, Zhelun, He, Wanggui, Li, Juncheng, Jiang, Hao, Tang, Siliang, Zhuang, Yueting
While Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have effectively addressed GPU memory constraints during fine-tuning, their performance often falls short, especially in multidimensional task scenarios. To address this issue, one straig
External link:
http://arxiv.org/abs/2408.09856
Authors:
Sun, Yanwen, Chen, Chaobo, Albert, Thies J., Li, Haoyuan, Arefev, Mikhail I., Chen, Ying, Dunne, Mike, Glownia, James M., Hoffmann, Matthias, Hurley, Matthew J., Mo, Mianzhen, Nguyen, Quynh L., Sato, Takahiro, Song, Sanghoon, Sun, Peihao, Sutton, Mark, Teitelbaum, Samuel, Valavanis, Antonios S., Wang, Nan, Zhu, Diling, Zhigilei, Leonid V., Sokolowski-Tinten, Klaus
Femtosecond laser ablation is a process of fundamental physics interest with wide industrial applications. For decades, the lack of probes on the relevant time and length scales has prevented access to the highly nonequilibrium phase
External link:
http://arxiv.org/abs/2407.10505
Authors:
He, Wanggui, Fu, Siming, Liu, Mushui, Wang, Xierui, Xiao, Wenyi, Shu, Fangxun, Wang, Yi, Zhang, Lei, Yu, Zhelun, Li, Haoyuan, Huang, Ziwei, Gan, LeiLei, Jiang, Hao
Auto-regressive models have made significant progress in the realm of language generation, yet they do not perform on par with diffusion models in the domain of image synthesis. In this work, we introduce MARS, a novel framework for T2I generation th
External link:
http://arxiv.org/abs/2407.07614
Authors:
Wang, Ye, Xun, Jiahao, Hong, Minjie, Zhu, Jieming, Jin, Tao, Lin, Wang, Li, Haoyuan, Li, Linjun, Xia, Yan, Zhao, Zhou, Dong, Zhenhua
Generative retrieval has recently emerged as a promising approach to sequential recommendation, framing candidate item retrieval as an autoregressive sequence generation problem. However, existing generative methods typically focus solely on either b
External link:
http://arxiv.org/abs/2406.14017