Search results - "Song, Xuemeng"

Report

Pseudo-triplet Guided Few-shot Composed Image Retrieval

Authors: Hou, Bohan; Lin, Haoqiang; Wen, Haokun; Liu, Meng; Song, Xuemeng

Composed Image Retrieval (CIR) is a challenging task that aims to retrieve the target image based on a multimodal query, i.e., a reference image and its corresponding modification text. While previous supervised or zero-shot learning paradigms all fa

External link: http://arxiv.org/abs/2407.06001

Report

MMGRec: Multimodal Generative Recommendation with Transformer Model

Authors: Liu, Han; Wei, Yinwei; Song, Xuemeng; Guan, Weili; Li, Yuan-Fang; Nie, Liqiang

Multimodal recommendation aims to recommend user-preferred candidates based on her/his historically interacted items and associated multimodal information. Previous studies commonly employ an embed-and-retrieve paradigm: learning user and item repres

External link: http://arxiv.org/abs/2404.16555

Report

Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval

Authors: Wen, Haokun; Song, Xuemeng; Chen, Xiaolin; Wei, Yinwei; Nie, Liqiang; Chua, Tat-Seng

Composed image retrieval (CIR) aims to retrieve the target image based on a multimodal query, i.e., a reference image paired with corresponding modification text. Recent CIR studies leverage vision-language pre-trained (VLP) methods as the feature ex

External link: http://arxiv.org/abs/2404.15875

Report

Interactive Garment Recommendation with User in the Loop

Authors: Becattini, Federico; Chen, Xiaolin; Puccia, Andrea; Wen, Haokun; Song, Xuemeng; Nie, Liqiang; Del Bimbo, Alberto

Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases. In this paper, we work under the assumption that no prior knowledge is given about a user. We propose to build

External link: http://arxiv.org/abs/2402.11627

Report

Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue

Authors: Ouyang, Kun; Jing, Liqiang; Song, Xuemeng; Liu, Meng; Hu, Yupeng; Nie, Liqiang

Sarcasm Explanation in Dialogue (SED) is a new yet challenging task, which aims to generate a natural language explanation for the given sarcastic dialogue that involves multiple modalities (i.e., utterance, video, and audio). Although existing studi

External link: http://arxiv.org/abs/2402.03658

Report

Prompt-based Multi-interest Learning Method for Sequential Recommendation

Authors: Dong, Xue; Song, Xuemeng; Liu, Tongliang; Guan, Weili

Multi-interest learning method for sequential recommendation aims to predict the next item according to user multi-faceted interests given the user historical interactions. Existing methods mainly consist of a multi-interest extractor that embeds the

External link: http://arxiv.org/abs/2401.04312

Report

VK-G2T: Vision and Context Knowledge enhanced Gloss2Text

Authors: Jing, Liqiang; Song, Xuemeng; Zu, Xinxing; Zheng, Na; Zhao, Zhongzhou; Nie, Liqiang

Existing sign language translation methods follow a two-stage pipeline: first converting the sign language video to a gloss sequence (i.e. Sign2Gloss) and then translating the generated gloss sequence into a spoken language sentence (i.e. Gloss2Text)

External link: http://arxiv.org/abs/2312.10210

Report

Target-Guided Composed Image Retrieval

Authors: Wen, Haokun; Zhang, Xian; Song, Xuemeng; Wei, Yinwei; Nie, Liqiang

Published in: ACM Multimedia 2023

Composed image retrieval (CIR) is a new and flexible image retrieval paradigm, which can retrieve the target image for a multimodal query, including a reference image and its corresponding modification text. Although existing efforts have achieved co

External link: http://arxiv.org/abs/2309.01366

Book

Compatibility Modeling: Data and Knowledge Applications for Clothing Matching. [electronic resource]

Author: Song, Xuemeng

External link: KNAV e-book collection. Registered users: full text online 5 minutes, further access on request.

Report

Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation

Authors: Jing, Liqiang; Song, Xuemeng; Ouyang, Kun; Jia, Mengzhao; Nie, Liqiang

Published in: ACL 2023

Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to generate a natural language sentence for a multimodal social post (an image as well as its caption) to explain why it contains sarcasm. Although the existing pioneer s

External link: http://arxiv.org/abs/2306.16650
