Search results

Report

Hypergraph based Understanding for Document Semantic Entity Recognition

Authors: Li, Qiwei; Li, Zuchao; Wang, Ping; Ai, Haojun; Zhao, Hai

Semantic entity recognition is an important task in the field of visually-rich document understanding. It distinguishes the semantic types of text by analyzing the position relationship between text nodes and the relation between text content. The ex

External link: http://arxiv.org/abs/2407.06904

Report

Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba

Authors: Zou, Yuchen; Chen, Yineng; Li, Zuchao; Zhang, Lefei; Zhao, Hai

Transformer, a deep neural network architecture, has long dominated the field of natural language processing and beyond. Nevertheless, the recent introduction of Mamba challenges its supremacy, sparks considerable interest among researchers, and give

External link: http://arxiv.org/abs/2406.16722

Report

The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models

Authors: Li, Jiajia; Yang, Lu; Tang, Mingni; Chen, Cong; Li, Zuchao; Wang, Ping; Zhao, Hai

Benchmarks play a pivotal role in assessing the advancements of large language models (LLMs). While numerous benchmarks have been proposed to evaluate LLMs' capabilities, there is a notable absence of a dedicated benchmark for assessing their musical

External link: http://arxiv.org/abs/2406.15885

Report

MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description

Authors: Yang, Cong; Li, Zuchao; Zhang, Lefei

Recently, large multimodal models have built a bridge from visual to textual information, but they tend to underperform in remote sensing scenarios. This underperformance is due to the complex distribution of objects and the significant scale differe

External link: http://arxiv.org/abs/2406.04716

Report

GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment

Authors: Yao, Yao; Li, Zuchao; Zhao, Hai

The burgeoning size of Large Language Models (LLMs) has led to enhanced capabilities in generating responses, albeit at the expense of increased inference times and elevated resource demands. Existing methods of acceleration, predominantly hinged on

External link: http://arxiv.org/abs/2405.19635

Report

SirLLM: Streaming Infinite Retentive LLM

Authors: Yao, Yao; Li, Zuchao; Zhao, Hai

As Large Language Models (LLMs) become increasingly prevalent in various domains, their ability to process inputs of any length and maintain a degree of memory becomes essential. However, the one-off input of overly long texts is limited, as studies

External link: http://arxiv.org/abs/2405.12528

Report

DSDRNet: Disentangling Representation and Reconstruct Network for Domain Generalization

Authors: Yang, Juncheng; Li, Zuchao; Xie, Shuai; Yu, Wei; Li, Shijun

Domain generalization faces challenges due to the distribution shift between training and testing sets, and the presence of unseen target domains. Common solutions include domain alignment, meta-learning, data augmentation, or ensemble learning, all

External link: http://arxiv.org/abs/2404.13848

Report

Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models

Authors: Yang, Juncheng; Li, Zuchao; Xie, Shuai; Zhu, Weiping; Yu, Wei; Li, Shijun

Adapter-based parameter-efficient transfer learning has achieved exciting results in vision-language models. Traditional adapter methods often require training or fine-tuning, facing challenges such as insufficient samples or resource limitations. Wh

External link: http://arxiv.org/abs/2404.12588

Report

Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

Authors: Yang, Juncheng; Li, Zuchao; Xie, Shuai; Yu, Wei; Li, Shijun; Du, Bo

The chain-of-thought technique has been well received in multi-modal tasks. It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts. However, human thought processes are predo

External link: http://arxiv.org/abs/2404.04538

Report

Multi-modal Auto-regressive Modeling via Visual Words

Authors: Peng, Tianshuo; Li, Zuchao; Zhang, Lefei; Zhao, Hai; Wang, Ping; Du, Bo

Large Language Models (LLMs), benefiting from the auto-regressive modelling approach performed on massive unannotated text corpora, demonstrate powerful perceptual and reasoning capabilities. However, as for extending auto-regressive modelling to m

External link: http://arxiv.org/abs/2403.07720
