Search results

Report

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Authors: Luo, Ziyang; Wu, Haoning; Li, Dongxu; Ma, Jing; Kankanhalli, Mohan; Li, Junnan

Large multimodal models (LMMs) with advanced video analysis capabilities have recently garnered significant attention. However, most evaluations rely on traditional methods like multiple-choice questions in benchmarks such as VideoMME and LongVideoBe

External link: http://arxiv.org/abs/2411.13281

Report

PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment

Authors: Liu, Dongxu; Xu, Bing; Chen, Yinzhuo; Xu, Bufan; Lu, Wenpeng; Yang, Muyun; Zhao, Tiejun

Reinforcement Learning from Human Feedback (RLHF) has been proven to be an effective method for preference alignment of large language models (LLMs) and is widely used in the post-training process of LLMs. However, RLHF struggles with handling multip

External link: http://arxiv.org/abs/2411.01245

Report

Aria: An Open Multimodal Native Mixture-of-Experts Model

Authors: Li, Dongxu; Liu, Yudong; Wu, Haoning; Wang, Yue; Shen, Zhiqi; Qu, Bowen; Niu, Xinyao; Wang, Guoyin; Chen, Bei; Li, Junnan

Information comes in diverse modalities. Multimodal native AI models are essential to integrate real-world information and deliver comprehensive understanding. While proprietary multimodal native models exist, their lack of openness imposes obstacles

External link: http://arxiv.org/abs/2410.05993

Report

Formation and Eruption of Hot Channel Magnetic Flux Rope in Nested Double Null Magnetic System

Authors: Yao, Surui; Shen, Yuandeng; Zhou, Chengrui; Liu, Dongxu; Zhou, Xinping

The coronal magnetic topology significantly affects the outcome of magnetic flux rope (MFR) eruptions. How the recently reported nested double null magnetic system affects MFR eruptions remains unclear. Using observations from the New Vacuum

External link: http://arxiv.org/abs/2410.03100

Report

EZSR: Event-based Zero-Shot Recognition

Authors: Yang, Yan; Pan, Liyuan; Li, Dongxu; Liu, Liu

This paper studies zero-shot object recognition using event camera data. Guided by CLIP, which is pre-trained on RGB images, existing approaches achieve zero-shot object recognition by optimizing embedding similarities between event data and RGB imag

External link: http://arxiv.org/abs/2407.21616

Report

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

Authors: Wu, Haoning; Li, Dongxu; Chen, Bei; Li, Junnan

Large multimodal models (LMMs) are processing increasingly longer and richer inputs. Despite this progress, few public benchmarks are available to measure such development. To mitigate this gap, we introduce LongVideoBench, a question-answering benchmark

External link: http://arxiv.org/abs/2407.15754

Report

Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses

Authors: Zhang, Dongxu; Gangal, Varun; Lattimer, Barrett Martin; Yang, Yi

Detecting hallucinations in large language model (LLM) outputs is pivotal, yet traditional fine-tuning for this classification task is impeded by the expensive and quickly outdated annotation process, especially across numerous vertical domains and i

External link: http://arxiv.org/abs/2407.05474

Report

STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft

Authors: Zhao, Zhonghan; Chai, Wenhao; Wang, Xuan; Ma, Ke; Chen, Kewei; Guo, Dongxu; Ye, Tian; Zhang, Yanting; Wang, Hongwei; Wang, Gaoang

Building an embodied agent system with a large language model (LLM) as its core is a promising direction. Due to the significant costs and uncontrollable factors associated with deploying and training such agents in the real world, we have decided to

External link: http://arxiv.org/abs/2406.11247

Report

PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery

Authors: Wang, Libo; Li, Dongxu; Dong, Sijun; Meng, Xiaoliang; Zhang, Xiaokang; Hong, Danfeng

Semantic segmentation, as a basic tool for intelligent interpretation of remote sensing images, plays a vital role in many Earth Observation (EO) applications. Nowadays, accurate semantic segmentation of remote sensing images remains a challenge due

External link: http://arxiv.org/abs/2406.10828

Report

A theoretical framework for multi-physics modeling of poro-visco-hyperelasticity-induced time-dependent fracture of blood clots

Authors: Liu, Dongxu; Nguyen, Nhung; Bui, Tinh Quoc; Pocivavsek, Luka

Fracture resistance of blood clots plays a crucial role in physiological hemostasis and pathological thromboembolism. Although recent experimental and computational studies uncovered the poro-viscoelastic property of blood clots and its connection to

External link: http://arxiv.org/abs/2406.15432
