Showing 1 - 10 of 58,434 results for search: '"Li , Bo"'
Author:
Guo, Jarvis, Zheng, Tuney, Bai, Yuelin, Li, Bo, Wang, Yubo, Zhu, King, Li, Yizhi, Neubig, Graham, Chen, Wenhu, Yue, Xiang
Open-source multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks. However, their reasoning capabilities remain constrained by existing instruction-tuning datasets, which were predominately rep
External link:
http://arxiv.org/abs/2412.05237
Author:
Jain, Swayambhoo, Raju, Ravi, Li, Bo, Csaki, Zoltan, Li, Jonathan, Liang, Kaizhao, Feng, Guoyao, Thakkar, Urmish, Sampat, Anand, Prabhakar, Raghu, Jairath, Sumati
Large Language Models (LLMs) have achieved remarkable advancements, but their monolithic nature presents challenges in terms of scalability, cost, and customization. This paper introduces the Composition of Experts (CoE), a modular compound AI system
External link:
http://arxiv.org/abs/2412.01868
Learning lighting adaption is a key step in obtaining a good visual perception and supporting downstream vision tasks. There are multiple light-related tasks (e.g., image retouching and exposure correction) and previous studies have mainly investigat
External link:
http://arxiv.org/abs/2412.01493
Despite the significant advancements made by Diffusion Transformer (DiT)-based methods in video generation, there remains a notable gap with controllable camera pose perspectives. Existing works such as OpenSora do NOT adhere precisely to anticipated
External link:
http://arxiv.org/abs/2412.01429
This paper studies the performative policy learning problem, where agents adjust their features in response to a released policy to improve their potential outcomes, inducing an endogenous distribution shift. There has been growing interest in traini
External link:
http://arxiv.org/abs/2412.01344
In disordered Hermitian systems, localization of energy eigenstates prohibits wave propagation. In non-Hermitian systems, however, wave propagation is possible even when the eigenstates of the Hamiltonian are exponentially localized by disorder. We find
External link:
http://arxiv.org/abs/2411.19905
Author:
Zhang, Xinyu, Zhang, Lingling, Wu, Yanrui, Huang, Muye, Wu, Wenjun, Li, Bo, Wang, Shaowei, Liu, Jun
Visual Question Generation (VQG) has gained significant attention due to its potential in educational applications. However, VQG research mainly focuses on natural images, neglecting diagrams in educational materials used to assess students' conceptu
External link:
http://arxiv.org/abs/2411.17771
Author:
Fu, Chaoyou, Zhang, Yi-Fan, Yin, Shukang, Li, Bo, Fang, Xinyu, Zhao, Sirui, Duan, Haodong, Sun, Xing, Liu, Ziwei, Wang, Liang, Shan, Caifeng, He, Ran
As a prominent direction of Artificial General Intelligence (AGI), Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia. Building upon pre-trained LLMs, this family of models further develops mult
External link:
http://arxiv.org/abs/2411.15296
Recent advances in Large Multimodal Models (LMMs) lead to significant breakthroughs in both academia and industry. One question that arises is how we, as humans, can understand their internal neural representations. This paper takes an initial step t
External link:
http://arxiv.org/abs/2411.14982