Showing 1 - 10 of 636 for search: '"Li, Yining"'
Author:
Zhang, Pan, Dong, Xiaoyi, Zang, Yuhang, Cao, Yuhang, Qian, Rui, Chen, Lin, Guo, Qipeng, Duan, Haodong, Wang, Bin, Ouyang, Linke, Zhang, Songyang, Zhang, Wenwei, Li, Yining, Gao, Yang, Sun, Peng, Zhang, Xinyue, Li, Wei, Li, Jingwen, Wang, Wenhai, Yan, Hang, He, Conghui, Zhang, Xingcheng, Chen, Kai, Dai, Jifeng, Qiao, Yu, Lin, Dahua, Wang, Jiaqi
We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large vision-language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities…
External link:
http://arxiv.org/abs/2407.03320
Author:
Chen, Yicheng, Li, Xiangtai, Li, Yining, Zeng, Yanhong, Wu, Jianzong, Zhao, Xiangyu, Chen, Kai
Diffusion-based models have shown great potential in generating high-quality images with various layouts, which can benefit downstream perception tasks. However, a fully automatic layout generation driven only by language and a suitable metric for…
External link:
http://arxiv.org/abs/2406.20085
Multi-modal large language models (MLLMs) have made significant strides in various visual understanding tasks. However, the majority of these models are constrained to process low-resolution images, which limits their effectiveness in perception tasks…
External link:
http://arxiv.org/abs/2406.17770
Author:
Wu, Jianzong, Li, Xiangtai, Zeng, Yanhong, Zhang, Jiangning, Zhou, Qianyu, Li, Yining, Tong, Yunhai, Chen, Kai
In this work, we present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements. By leveraging a few images of a specific object, we efficiently fine-tune a text-to-video…
External link:
http://arxiv.org/abs/2406.17758
Author:
Fei, Zhiwei, Zhang, Songyang, Shen, Xiaoyu, Zhu, Dawei, Wang, Xiao, Cao, Maosong, Zhou, Fengzhe, Li, Yining, Zhang, Wenwei, Lin, Dahua, Chen, Kai, Ge, Jidong
While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field. In this paper, we introduce InternLM-Law…
External link:
http://arxiv.org/abs/2406.14887
The advent of large vision-language models (LVLMs) has spurred research into their applications in multi-modal contexts, particularly in video understanding. Traditional VideoQA benchmarks, despite providing quantitative metrics, often fail to encompass…
External link:
http://arxiv.org/abs/2406.14515
Author:
Hu, Kai, Yu, Weichen, Yao, Tianjun, Li, Xiang, Liu, Wenhe, Yu, Lijun, Li, Yining, Chen, Kai, Shen, Zhiqiang, Fredrikson, Matt
Recent research indicates that large language models (LLMs) are susceptible to jailbreaking attacks that can generate harmful content. This paper introduces a novel token-level attack method, Adaptive Dense-to-Sparse Constrained Optimization (ADC), which…
External link:
http://arxiv.org/abs/2405.09113
Author:
Dong, Xiaoyi, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, Wang, Bin, Ouyang, Linke, Zhang, Songyang, Duan, Haodong, Zhang, Wenwei, Li, Yining, Yan, Hang, Gao, Yang, Chen, Zhe, Zhang, Xinyue, Li, Wei, Li, Jingwen, Wang, Wenhai, Chen, Kai, He, Conghui, Zhang, Xingcheng, Dai, Jifeng, Qiao, Yu, Lin, Dahua, Wang, Jiaqi
The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression has been hindered by challenges in comprehending fine-grained visual content due to limited resolution. Recent efforts have aimed to enhance…
External link:
http://arxiv.org/abs/2404.06512
Author:
Cai, Zheng, Cao, Maosong, Chen, Haojiong, Chen, Kai, Chen, Keyu, Chen, Xin, Chen, Xun, Chen, Zehui, Chen, Zhi, Chu, Pei, Dong, Xiaoyi, Duan, Haodong, Fan, Qi, Fei, Zhaoye, Gao, Yang, Ge, Jiaye, Gu, Chenya, Gu, Yuzhe, Gui, Tao, Guo, Aijia, Guo, Qipeng, He, Conghui, Hu, Yingfan, Huang, Ting, Jiang, Tao, Jiao, Penglong, Jin, Zhenjiang, Lei, Zhikai, Li, Jiaxing, Li, Jingwen, Li, Linyang, Li, Shuaibin, Li, Wei, Li, Yining, Liu, Hongwei, Liu, Jiangning, Hong, Jiawei, Liu, Kaiwen, Liu, Kuikun, Liu, Xiaoran, Lv, Chengqi, Lv, Haijun, Lv, Kai, Ma, Li, Ma, Runyuan, Ma, Zerun, Ning, Wenchang, Ouyang, Linke, Qiu, Jiantao, Qu, Yuan, Shang, Fukai, Shao, Yunfan, Song, Demin, Song, Zifan, Sui, Zhihao, Sun, Peng, Sun, Yu, Tang, Huanze, Wang, Bin, Wang, Guoteng, Wang, Jiaqi, Wang, Jiayu, Wang, Rui, Wang, Yudong, Wang, Ziyi, Wei, Xingjian, Weng, Qizhen, Wu, Fan, Xiong, Yingtong, Xu, Chao, Xu, Ruiliang, Yan, Hang, Yan, Yirong, Yang, Xiaogui, Ye, Haochen, Ying, Huaiyuan, Yu, Jia, Yu, Jing, Zang, Yuhang, Zhang, Chuyu, Zhang, Li, Zhang, Pan, Zhang, Peng, Zhang, Ruijie, Zhang, Shuo, Zhang, Songyang, Zhang, Wenjian, Zhang, Wenwei, Zhang, Xingcheng, Zhang, Xinyue, Zhao, Hui, Zhao, Qian, Zhao, Xiaomeng, Zhou, Fengzhe, Zhou, Zaida, Zhuo, Jingming, Zou, Yicheng, Qiu, Xipeng, Qiao, Yu, Lin, Dahua
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces…
External link:
http://arxiv.org/abs/2403.17297
Author:
Dong, Xiaoyi, Zhang, Pan, Zang, Yuhang, Cao, Yuhang, Wang, Bin, Ouyang, Linke, Wei, Xilin, Zhang, Songyang, Duan, Haodong, Cao, Maosong, Zhang, Wenwei, Li, Yining, Yan, Hang, Gao, Yang, Zhang, Xinyue, Li, Wei, Li, Jingwen, Chen, Kai, He, Conghui, Zhang, Xingcheng, Qiao, Yu, Lin, Dahua, Wang, Jiaqi
We introduce InternLM-XComposer2, a cutting-edge vision-language model excelling in free-form text-image composition and comprehension. This model goes beyond conventional vision-language understanding, adeptly crafting interleaved text-image content…
External link:
http://arxiv.org/abs/2401.16420