Showing 1 - 10 of 146 results for search: '"Huang, Yilun"'
High-performance Multimodal Large Language Models (MLLMs) rely heavily on data quality. This study introduces a novel dataset named Img-Diff, designed to enhance fine-grained image recognition in MLLMs by leveraging insights from contrastive learning…
External link:
http://arxiv.org/abs/2408.04594
The emergence of large-scale multi-modal generative models has drastically advanced artificial intelligence, introducing unprecedented levels of performance and functionality. However, optimizing these models remains challenging due to historically…
External link:
http://arxiv.org/abs/2407.11784
Author:
Qin, Zhen, Chen, Daoyuan, Zhang, Wenhao, Yao, Liuyi, Huang, Yilun, Ding, Bolin, Li, Yaliang, Deng, Shuiguang
The rapid development of large language models (LLMs) has been witnessed in recent years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from text to a broader spectrum of domains, attracting widespread attention due to the…
External link:
http://arxiv.org/abs/2407.08583
Despite the impressive capabilities of Multimodal Large Language Models (MLLMs) in integrating text and image modalities, challenges remain in accurately interpreting detailed visual elements. This paper presents an empirical study on enhancing MLLMs…
External link:
http://arxiv.org/abs/2401.17981
Author:
Chen, Daoyuan, Huang, Yilun, Ma, Zhijian, Chen, Hesen, Pan, Xuchen, Ge, Ce, Gao, Dawei, Xie, Yuexiang, Liu, Zhaoyang, Gao, Jinyang, Li, Yaliang, Ding, Bolin, Zhou, Jingren
The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, heterogeneous, and high-quality data. A data recipe is a mixture of data from different sources for training LLMs, which plays a vital role in LLMs'…
External link:
http://arxiv.org/abs/2309.02033
The rapid advances in Vision Transformer (ViT) refresh the state-of-the-art performances in various vision tasks, overshadowing the conventional CNN-based models. This ignites a few recent striking-back research efforts in the CNN world showing that pure CNN…
External link:
http://arxiv.org/abs/2303.02165
Author:
Zhang, Qi, Yang, Zijian, Huang, Yilun, Chen, Ze, Cai, Zijian, Wang, Kangxu, Zheng, Jiewen, He, Jiarong, Gao, Jin
In this paper, we present our solution to the Multilingual Information Retrieval Across a Continuum of Languages (MIRACL) challenge of WSDM CUP 2023 (https://project-miracl.github.io/). Our solution focuses on enhancing the ranking stage…
External link:
http://arxiv.org/abs/2302.07010
In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO is extended from YOLO with some new technologies, including Neural Architecture…
External link:
http://arxiv.org/abs/2211.15444
Author:
Zhang, Qi, Yang, Zijian, Huang, Yilun, Chen, Ze, Cai, Zijian, Wang, Kangxu, Zheng, Jiewen, He, Jiarong, Gao, Jin
This paper mainly describes our winning solution (team name: www) to the Amazon ESCI Challenge of KDD CUP 2022, which achieves an NDCG score of 0.9043 and wins first place on task 1, the query-product ranking track. In this competition, participants…
External link:
http://arxiv.org/abs/2208.02958
The Cross-Market Recommendation task of WSDM CUP 2022 is about finding solutions to improve individual recommendation systems in resource-scarce target markets by leveraging data from similar high-resource source markets. Finally, our team OPDAI won…
External link:
http://arxiv.org/abs/2203.00897