Showing 1 - 10 of 17 for search: '"Jiang Dongfu"'
Author:
Chen, Jiacheng, Liang, Tianhao, Siu, Sherman, Wang, Zhengqing, Wang, Kai, Wang, Yubo, Ni, Yuansheng, Zhu, Wang, Jiang, Ziyan, Lyu, Bohan, Jiang, Dongfu, He, Xuan, Liu, Yuan, Hu, Hexiang, Yue, Xiang, Chen, Wenhu
We present MEGA-Bench, an evaluation suite that scales multimodal evaluation to over 500 real-world tasks, to address the highly heterogeneous daily use cases of end users. Our objective is to optimize for a set of high-quality data samples that cover…
External link:
http://arxiv.org/abs/2410.10563
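The entry above describes an evaluation suite spanning hundreds of heterogeneous tasks. Below is a minimal sketch of how such a suite can aggregate per-task metrics into one number, macro-averaging so small tasks count equally. The task names, toy exact-match metric, and evaluate helper are invented for illustration; none of this is MEGA-Bench's actual harness.

# Hypothetical sketch of heterogeneous-task evaluation aggregation.
# Task names and metric functions are invented, not from MEGA-Bench.
from statistics import mean

def exact_match(pred: str, ref: str) -> float:
    return float(pred.strip().lower() == ref.strip().lower())

tasks = {
    "chart_qa":    {"metric": exact_match, "examples": [("42", "42"), ("7", "8")]},
    "ocr_receipt": {"metric": exact_match, "examples": [("total: 3.50", "total: 3.50")]},
}

def evaluate(model_fn):
    per_task = {}
    for name, task in tasks.items():
        scores = [task["metric"](model_fn(x), y) for x, y in task["examples"]]
        per_task[name] = mean(scores)
    # Macro-average so large tasks do not dominate small ones.
    return per_task, mean(per_task.values())

per_task, overall = evaluate(lambda x: x)   # identity "model" for illustration
print(per_task, overall)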
Author:
He, Xuan, Jiang, Dongfu, Zhang, Ge, Ku, Max, Soni, Achint, Siu, Sherman, Chen, Haonan, Chandra, Abhranil, Jiang, Ziyan, Arulraj, Aaran, Wang, Kai, Do, Quy Duc, Ni, Yuansheng, Lyu, Bohan, Narsupalli, Yaswanth, Fan, Rongqi, Lyu, Zhiheng, Lin, Yuchen, Chen, Wenhu
Recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metrics is able to provide reliable scores for generated videos. The main barrier…
External link:
http://arxiv.org/abs/2406.15252
Recent breakthroughs in vision-language models (VLMs) emphasize the necessity of benchmarking human preferences in real-world multimodal interactions. To address this gap, we launched WildVision-Arena (WV-Arena), an online platform that collects human…
External link:
http://arxiv.org/abs/2406.11069
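Arena-style platforms like WV-Arena rank models from pairwise human votes; a common choice for turning votes into a leaderboard is an Elo-style update, sketched below. The constants and the vote list are illustrative assumptions about the general technique, not WV-Arena's actual code.

# Minimal Elo update over pairwise human votes; K and the votes are illustrative.
K = 32  # update step size

def elo_update(ra: float, rb: float, winner: str) -> tuple[float, float]:
    ea = 1 / (1 + 10 ** ((rb - ra) / 400))   # expected score of model A
    sa = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    return ra + K * (sa - ea), rb + K * ((1 - sa) - (1 - ea))

ratings = {"model_a": 1000.0, "model_b": 1000.0}
for vote in ["a", "a", "tie", "b"]:          # hypothetical user votes
    ratings["model_a"], ratings["model_b"] = elo_update(
        ratings["model_a"], ratings["model_b"], vote)
print(ratings)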
Published in:
NeurIPS 2024
Generative AI has made remarkable strides in revolutionizing fields such as image and video generation. These advancements are driven by innovative algorithms, architectures, and data. However, the rapid proliferation of generative models has highlighted…
External link:
http://arxiv.org/abs/2406.04485
Published in:
Transactions on Machine Learning Research 2024
Large multimodal models (LMMs) have shown great results on single-image vision-language tasks. However, their ability to solve multi-image vision-language tasks is yet to be improved. Existing LMMs such as OpenFlamingo, Emu2, and Idefics gain their…
External link:
http://arxiv.org/abs/2405.01483
In the rapidly advancing field of conditional image generation research, challenges such as limited explainability hinder the effective evaluation of the performance and capabilities of various models. This paper introduces VIEScore, a Visual Instruction-guided Explainable metric…
External link:
http://arxiv.org/abs/2312.14867
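VIEScore scores conditional image generation by prompting a multimodal judge with the generation instruction. A minimal sketch of that idea follows; the prompt wording and the ask_vlm helper are placeholders for any multimodal chat API, not the paper's released prompts or implementation.

# Sketch of the instruction-guided, VLM-as-judge scoring idea; the prompt
# text and ask_vlm are placeholder assumptions, not VIEScore's actual code.
def build_judge_prompt(instruction: str) -> str:
    return (
        "You are grading a conditional image generation model.\n"
        f"Generation instruction: {instruction}\n"
        "Rate the attached image from 0 to 10 for (1) semantic consistency "
        "with the instruction and (2) perceptual quality. "
        "Explain your reasoning, then output the two scores as JSON."
    )

def score_image(ask_vlm, instruction: str, image_path: str) -> str:
    # ask_vlm(prompt, image_path) stands in for any multimodal chat API call.
    return ask_vlm(build_judge_prompt(instruction), image_path)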
Author:
Yue, Xiang, Ni, Yuansheng, Zhang, Kai, Zheng, Tianyu, Liu, Ruoqi, Zhang, Ge, Stevens, Samuel, Jiang, Dongfu, Ren, Weiming, Sun, Yuxuan, Wei, Cong, Yu, Botao, Yuan, Ruibin, Sun, Renliang, Yin, Ming, Zheng, Boyuan, Yang, Zhenzhu, Liu, Yibo, Huang, Wenhao, Sun, Huan, Su, Yu, Chen, Wenhu
We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning. MMMU includes 11.5K meticulously collected multimodal questions from college…
External link:
http://arxiv.org/abs/2311.16502
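A quick way to inspect MMMU is through the Hugging Face datasets library. The hub path, subject config, and field names below reflect my understanding of the public release; verify them against the dataset card before relying on this sketch.

# Loading one MMMU subject split; hub path and fields are assumptions to check.
from datasets import load_dataset

subset = load_dataset("MMMU/MMMU", "Accounting", split="validation")
example = subset[0]
print(example["question"])   # college-level question text
print(example["options"])    # multiple-choice options
print(example["answer"])     # gold answer (hidden in the test split)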
We present TIGERScore, a Trained metric that follows Instruction Guidance to perform Explainable and Reference-free evaluation over a wide spectrum of text generation tasks. Different from other automatic…
External link:
http://arxiv.org/abs/2310.00752
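TIGERScore performs reference-free evaluation by having a trained judge LM list the errors in a system output along with score reductions. The sketch below captures that recipe in simplified form; the template, the "reduction:" output format, and the parsing regex are stand-ins, not the released implementation.

# Simplified TIGERScore-style scoring: prompt a judge LM for an error analysis,
# then sum the penalties it reports. Template and regex format are assumptions.
import re

TEMPLATE = (
    "Instruction: {instruction}\n"
    "Input: {source}\n"
    "System output: {hypothesis}\n"
    "List each error in the output with an explanation and a score reduction."
)

def tigerscore_like(judge_lm, instruction, source, hypothesis) -> float:
    analysis = judge_lm(TEMPLATE.format(instruction=instruction,
                                        source=source, hypothesis=hypothesis))
    penalties = [float(p) for p in re.findall(r"reduction:\s*(-?\d+\.?\d*)", analysis)]
    return -sum(abs(p) for p in penalties)   # 0 means no detected errors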
We present LLM-Blender, an ensembling framework designed to attain consistently superior performance by leveraging the diverse strengths of multiple open-source large language models (LLMs). Our framework consists of two modules: PairRanker and GenFuser…
External link:
http://arxiv.org/abs/2306.02561
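The abstract names LLM-Blender's two modules: PairRanker (pairwise comparison of candidate outputs) and GenFuser (fusing the top candidates into one answer). Here is a minimal sketch of that two-stage pipeline, with pair_ranker and gen_fuser as placeholder callables rather than the paper's trained models.

# Two-stage ensembling in miniature: rank candidates by pairwise wins, then
# fuse the top ones. pair_ranker and gen_fuser are placeholder callables.
def blend(question: str, candidates: list[str], pair_ranker, gen_fuser) -> str:
    # Score every ordered pair; a candidate's score is how often it wins.
    wins = {c: 0 for c in candidates}
    for a in candidates:
        for b in candidates:
            if a != b and pair_ranker(question, a, b):   # True if a beats b
                wins[a] += 1
    top_k = sorted(candidates, key=wins.get, reverse=True)[:3]
    # The fuser conditions on the question plus the top-ranked candidates.
    return gen_fuser(question, top_k)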
Pre-trained language models have been successful in natural language generation (NLG) tasks. While various decoding methods have been employed, they often produce suboptimal results. We first present an empirical analysis of three NLG tasks: summarization…
External link:
http://arxiv.org/abs/2212.10555
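The paper above analyzes decoding methods for NLG. For context, the standard methods are easy to compare side by side with the Hugging Face transformers generate API; the model choice and generation settings below are illustrative only, not the paper's experimental setup.

# Comparing standard decoding methods; model and settings are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
inputs = tok("Summarize: The cat sat on the mat all day.", return_tensors="pt")

for name, kwargs in {
    "greedy":      dict(do_sample=False),
    "beam search": dict(num_beams=4, do_sample=False),
    "top-p":       dict(do_sample=True, top_p=0.9),
}.items():
    out = model.generate(**inputs, max_new_tokens=30, **kwargs)
    print(name, "->", tok.decode(out[0], skip_special_tokens=True))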