Search results

Report

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

Authors: He, Xuan, Jiang, Dongfu, Zhang, Ge, Ku, Max, Soni, Achint, Siu, Sherman, Chen, Haonan, Chandra, Abhranil, Jiang, Ziyan, Arulraj, Aaran, Wang, Kai, Do, Quy Duc, Ni, Yuansheng, Lyu, Bohan, Narsupalli, Yaswanth, Fan, Rongqi, Lyu, Zhiheng, Lin, Yuchen, Chen, Wenhu

Recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metrics is able to provide reliable scores over generated videos. The main ba

External link: http://arxiv.org/abs/2406.15252

Report

GenAI Arena: An Open Evaluation Platform for Generative Models

Authors: Jiang, Dongfu, Ku, Max, Li, Tianle, Ni, Yuansheng, Sun, Shizhuo, Fan, Rongqi, Chen, Wenhu

Generative AI has made remarkable strides to revolutionize fields such as image and video generation. These advancements are driven by innovative algorithms, architecture, and data. However, the rapid proliferation of generative models has highlighte

External link: http://arxiv.org/abs/2406.04485

Report

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Authors: Wang, Yubo, Ma, Xueguang, Zhang, Ge, Ni, Yuansheng, Chandra, Abhranil, Guo, Shiguang, Ren, Weiming, Arulraj, Aaran, He, Xuan, Jiang, Ziyan, Li, Tianle, Ku, Max, Wang, Kai, Zhuang, Alex, Fan, Rongqi, Yue, Xiang, Chen, Wenhu

In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains. However,

External link: http://arxiv.org/abs/2406.01574

Report

MANTIS: Interleaved Multi-Image Instruction Tuning

Authors: Jiang, Dongfu, He, Xuan, Zeng, Huaye, Wei, Cong, Ku, Max, Liu, Qian, Chen, Wenhu

Large multimodal models (LMMs) have shown great results in single-image vision language tasks. However, their ability to solve multi-image visual language tasks is yet to be improved. The existing LMMs like OpenFlamingo, Emu2, Idefics gain their mu

External link: http://arxiv.org/abs/2405.01483

Report

AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks

Authors: Ku, Max, Wei, Cong, Ren, Weiming, Yang, Harry, Chen, Wenhu

In the dynamic field of digital content creation using generative models, state-of-the-art video editing models still do not offer the level of quality and control that users desire. Previous works on video editing either extended from image-based ge

External link: http://arxiv.org/abs/2403.14468

Report

VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation

Authors: Ku, Max, Jiang, Dongfu, Wei, Cong, Yue, Xiang, Chen, Wenhu

In the rapidly advancing field of conditional image generation research, challenges such as limited explainability lie in effectively evaluating the performance and capabilities of various models. This paper introduces VIEScore, a Visual Instruction-

External link: http://arxiv.org/abs/2312.14867

Report

ImagenHub: Standardizing the evaluation of conditional image generation models

Authors: Ku, Max, Li, Tianle, Zhang, Kai, Lu, Yujie, Fu, Xingyu, Zhuang, Wenwen, Chen, Wenhu

Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image genera

External link: http://arxiv.org/abs/2310.01596

Report

DreamEdit: Subject-driven Image Editing

Authors: Li, Tianle, Ku, Max, Wei, Cong, Chen, Wenhu

Subject-driven image generation aims at generating images containing customized subjects, which has recently drawn enormous attention from the research community. However, the previous works cannot precisely control the background and position of the

External link: http://arxiv.org/abs/2306.12624

Report

TheoremQA: A Theorem-driven Question Answering dataset

Authors: Chen, Wenhu, Yin, Ming, Ku, Max, Lu, Pan, Wan, Yixin, Ma, Xueguang, Xu, Jianyu, Wang, Xinyi, Xia, Tony

The recent LLMs like GPT-4 and PaLM-2 have made tremendous progress in solving fundamental math problems like GSM8K by achieving over 90% accuracy. However, their capabilities to solve more challenging math problems which require domain-specific know

External link: http://arxiv.org/abs/2305.12524
