Search results

Report

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

Authors: Shu, Fangxun; Liao, Yue; Zhuo, Le; Xu, Chenning; Zhang, Lei; Zhang, Guanghao; Shi, Haonan; Chen, Long; Zhong, Tao; He, Wanggui; Fu, Siming; Li, Haoyuan; Li, Bolin; Yu, Zhelun; Liu, Si; Li, Hongsheng; Jiang, Hao

We introduce LLaVA-MoD, a novel framework designed to enable the efficient training of small-scale Multimodal Language Models (s-MLLM) by distilling knowledge from large-scale MLLM (l-MLLM). Our approach tackles two fundamental challenges in MLLM dis

External link: http://arxiv.org/abs/2408.15881

Report

TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition

Authors: Lin, Tianwei; Liu, Jiang; Zhang, Wenqiao; Li, Zhaocheng; Dai, Yang; Li, Haoyuan; Yu, Zhelun; He, Wanggui; Li, Juncheng; Jiang, Hao; Tang, Siliang; Zhuang, Yueting

While Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have effectively addressed GPU memory constraints during fine-tuning, their performance often falls short, especially in multidimensional task scenarios. To address this issue, one straig

External link: http://arxiv.org/abs/2408.09856

Report

MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

Authors: He, Wanggui; Fu, Siming; Liu, Mushui; Wang, Xierui; Xiao, Wenyi; Shu, Fangxun; Wang, Yi; Zhang, Lei; Yu, Zhelun; Li, Haoyuan; Huang, Ziwei; Gan, LeiLei; Jiang, Hao

Auto-regressive models have made significant progress in the realm of language generation, yet they do not perform on par with diffusion models in the domain of image synthesis. In this work, we introduce MARS, a novel framework for T2I generation th

External link: http://arxiv.org/abs/2407.07614

Report

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Authors: Xiao, Wenyi; Huang, Ziwei; Gan, Leilei; He, Wanggui; Li, Haoyuan; Yu, Zhelun; Jiang, Hao; Wu, Fei; Zhu, Linchao

The rapidly developing Large Vision Language Models (LVLMs) have shown notable capabilities on a range of multi-modal tasks, but still face the hallucination phenomena where the generated texts do not align with the given contexts, significantly rest

External link: http://arxiv.org/abs/2404.14233

Report

TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System

Authors: Li, Haoyuan; Jiang, Hao; Zhang, Tianke; Yu, Zhelun; Yin, Aoxiong; Cheng, Hao; Fu, Siming; Zhang, Yuhao; He, Wanggui

Training AI models has always been challenging, especially when there is a need for custom models to provide personalized services. Algorithm engineers often face a lengthy process to iteratively develop models tailored to specific business requireme

External link: http://arxiv.org/abs/2311.06622

Fast compressive sensing reconstruction algorithm on FPGA using Orthogonal Matching Pursuit

Authors: Yangfeng Su, Weiping Shi, Yu Zhelun, Fan Yang, Jincheng Su, Dian Zhou, Xuan Zeng

Published in: ISCAS

This paper presents a fast compressive sensing reconstruction algorithm implemented on FPGA using Orthogonal Matching Pursuit (OMP). The algorithm is optimized with QR decomposition to solve the least square problem and avoids the square root operati

External links: https://explore.openaire.eu/search/publication?articleId=doi_________::0cb08a0b17e3b2bb1b85861e8a25b825
https://doi.org/10.1109/iscas.2016.7527217