Search results - "Meng, Gaofeng"

Report

A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem

Authors: Ding, Kun; Wang, Ying; Meng, Gaofeng; Xiang, Shiming

The advent of pre-trained vision-language foundation models has revolutionized the field of zero/few-shot (i.e., low-shot) image recognition. The key challenge to address under the condition of limited training data is how to fine-tune pre-trained vi

External link: http://arxiv.org/abs/2410.11686


Report

Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation

Authors: Ding, Kun; Yu, Qiang; Zhang, Haojian; Meng, Gaofeng; Xiang, Shiming

Cache-based approaches stand out as both effective and efficient for adapting vision-language models (VLMs). Nonetheless, the existing cache model overlooks three crucial aspects. 1) Pre-trained VLMs are mainly optimized for image-text similarity, ne

External link: http://arxiv.org/abs/2410.08895


Report

Force Sensing Guided Artery-Vein Segmentation via Sequential Ultrasound Images

Authors: Geng, Yimeng; Meng, Gaofeng; Chen, Mingcong; Cao, Guanglin; Zhao, Mingyang; Zhao, Jianbo; Liu, Hongbin

Accurate identification of arteries and veins in ultrasound images is crucial for vascular examinations and interventions in robotics-assisted surgeries. However, current methods for ultrasound vessel segmentation face challenges in distinguishing be

External link: http://arxiv.org/abs/2407.21394


Report

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

Authors: Xu, Shixiong; Zhang, Chenghao; Fan, Lubin; Meng, Gaofeng; Xiang, Shiming; Ye, Jieping

In this study, we introduce a new problem raised by social media and photojournalism, named Image Address Localization (IAL), which aims to predict the readable textual address where an image was taken. Existing two-stage approaches involve predictin

External link: http://arxiv.org/abs/2407.08156


Report

Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis

Authors: Zhao, Mingyang; Jiang, Jingen; Ma, Lei; Xin, Shiqing; Meng, Gaofeng; Yan, Dong-Ming

This paper presents a novel non-rigid point set registration method that is inspired by unsupervised clustering analysis. Unlike previous approaches that treat the source and target point sets as separate entities, we develop a holistic framework whe

External link: http://arxiv.org/abs/2406.18817


Report

A Multimodal Transformer for Live Streaming Highlight Prediction

Authors: Deng, Jiaxin; Wang, Shiyao; Shen, Dong; Zhao, Liqin; Yang, Fan; Zhou, Guorui; Meng, Gaofeng

Recently, live streaming platforms have gained immense popularity. Traditional video highlight detection mainly focuses on visual features and utilizes both past and future content for prediction. However, live streaming requires models to infer with

External link: http://arxiv.org/abs/2407.12002


Report

MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

Authors: Deng, Jiaxin; Wang, Shiyao; Wang, Yuchen; Qi, Jiansong; Zhao, Liqin; Zhou, Guorui; Meng, Gaofeng

Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction

External link: http://arxiv.org/abs/2407.00056


Report

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Authors: Ni, Bolin; Hu, JingCheng; Wei, Yixuan; Peng, Houwen; Zhang, Zheng; Meng, Gaofeng; Hu, Han

In this work, we present Xwin-LM, a comprehensive suite of alignment methodologies for large language models (LLMs). This suite encompasses several key techniques, including supervised finetuning (SFT), reward modeling (RM), rejection sampling finetu

External link: http://arxiv.org/abs/2405.20335


Report

Reusable Architecture Growth for Continual Stereo Matching

Authors: Zhang, Chenghao; Meng, Gaofeng; Fan, Bin; Tian, Kun; Zhang, Zhaoxiang; Xiang, Shiming; Pan, Chunhong

The remarkable performance of recent stereo depth estimation models benefits from the successful use of convolutional neural networks to regress dense disparity. Akin to most tasks, this needs gathering training data that covers a number of heterogen

External link: http://arxiv.org/abs/2404.00360


Report

Enhancing Visual Continual Learning with Language-Guided Supervision

Authors: Ni, Bolin; Zhao, Hongbo; Zhang, Chenghao; Hu, Ke; Meng, Gaofeng; Zhang, Zhaoxiang; Xiang, Shiming

Continual learning (CL) aims to empower models to learn new tasks without forgetting previously acquired knowledge. Most prior works concentrate on the techniques of architectures, replay data, regularization, etc. However, the category name of each

External link: http://arxiv.org/abs/2403.16124

