Showing 1 - 9 of 9 for search: '"Ku, Max"'
Author:
He, Xuan, Jiang, Dongfu, Zhang, Ge, Ku, Max, Soni, Achint, Siu, Sherman, Chen, Haonan, Chandra, Abhranil, Jiang, Ziyan, Arulraj, Aaran, Wang, Kai, Do, Quy Duc, Ni, Yuansheng, Lyu, Bohan, Narsupalli, Yaswanth, Fan, Rongqi, Lyu, Zhiheng, Lin, Yuchen, Chen, Wenhu
The recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metrics is able to provide reliable scores over generated videos. The main ba
External link:
http://arxiv.org/abs/2406.15252
Generative AI has made remarkable strides to revolutionize fields such as image and video generation. These advancements are driven by innovative algorithms, architecture, and data. However, the rapid proliferation of generative models has highlighte
External link:
http://arxiv.org/abs/2406.04485
Author:
Wang, Yubo, Ma, Xueguang, Zhang, Ge, Ni, Yuansheng, Chandra, Abhranil, Guo, Shiguang, Ren, Weiming, Arulraj, Aaran, He, Xuan, Jiang, Ziyan, Li, Tianle, Ku, Max, Wang, Kai, Zhuang, Alex, Fan, Rongqi, Yue, Xiang, Chen, Wenhu
In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains. However,
External link:
http://arxiv.org/abs/2406.01574
Large multimodal models (LMMs) have shown great results in single-image vision language tasks. However, their ability to solve multi-image visual language tasks is yet to be improved. The existing LMMs like OpenFlamingo, Emu2, Idefics gain their mu
External link:
http://arxiv.org/abs/2405.01483
In the dynamic field of digital content creation using generative models, state-of-the-art video editing models still do not offer the level of quality and control that users desire. Previous works on video editing either extended from image-based ge
External link:
http://arxiv.org/abs/2403.14468
In the rapidly advancing field of conditional image generation research, challenges such as limited explainability lie in effectively evaluating the performance and capabilities of various models. This paper introduces VIEScore, a Visual Instruction-
External link:
http://arxiv.org/abs/2312.14867
Recently, a myriad of conditional image generation and editing models have been developed to serve different downstream tasks, including text-to-image generation, text-guided image editing, subject-driven image generation, control-guided image genera
External link:
http://arxiv.org/abs/2310.01596
Subject-driven image generation aims at generating images containing customized subjects, which has recently drawn enormous attention from the research community. However, the previous works cannot precisely control the background and position of the
External link:
http://arxiv.org/abs/2306.12624
Author:
Chen, Wenhu, Yin, Ming, Ku, Max, Lu, Pan, Wan, Yixin, Ma, Xueguang, Xu, Jianyu, Wang, Xinyi, Xia, Tony
The recent LLMs like GPT-4 and PaLM-2 have made tremendous progress in solving fundamental math problems like GSM8K by achieving over 90% accuracy. However, their capabilities to solve more challenging math problems which require domain-specific know
External link:
http://arxiv.org/abs/2305.12524