Search results

Report

Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation

Authors: Li, Yongqi; Cai, Hongru; Wang, Wenjie; Qu, Leigang; Wei, Yinwei; Li, Wenjie; Nie, Liqiang; Chua, Tat-Seng

Text-to-image retrieval is a fundamental task in multimedia processing, aiming to retrieve semantically relevant cross-modal content. Traditional studies have typically approached this task as a discriminative problem, matching the text and image via

External link: http://arxiv.org/abs/2407.17274


Report

Harnessing Large Language Models for Multimodal Product Bundling

Authors: Liu, Xiaohao; Wu, Jie; Tao, Zhulin; Ma, Yunshan; Wei, Yinwei; Chua, Tat-seng

Product bundling provides clients with a strategic combination of individual items. And it has gained significant attention in recent years as a fundamental prerequisite for online services. Recent methods utilize multimodal information through sophi

External link: http://arxiv.org/abs/2407.11712


Report

PromptDSI: Prompt-based Rehearsal-free Instance-wise Incremental Learning for Document Retrieval

Authors: Huynh, Tuan-Luc; Vu, Thuy-Trang; Wang, Weiqing; Wei, Yinwei; Le, Trung; Gasevic, Dragan; Li, Yuan-Fang; Do, Thanh-Toan

Differentiable Search Index (DSI) utilizes Pre-trained Language Models (PLMs) for efficient document retrieval without relying on external indexes. However, DSIs need full re-training to handle updates in dynamic corpora, causing significant computat

External link: http://arxiv.org/abs/2406.12593


Report

High-level Codes and Fine-grained Weights for Online Multi-modal Hashing Retrieval

Authors: Zhan, Yu-Wei; Wu, Xiao-Ming; Luo, Xin; Wei, Yinwei; Xu, Xin-Shun

In the real world, multi-modal data often appears in a streaming fashion, and there is a growing demand for similarity retrieval from such non-stationary data, especially at a large scale. In response to this need, online multi-modal hashing has gain

External link: http://arxiv.org/abs/2406.10776


Report

MMGRec: Multimodal Generative Recommendation with Transformer Model

Authors: Liu, Han; Wei, Yinwei; Song, Xuemeng; Guan, Weili; Li, Yuan-Fang; Nie, Liqiang

Multimodal recommendation aims to recommend user-preferred candidates based on her/his historically interacted items and associated multimodal information. Previous studies commonly employ an embed-and-retrieve paradigm: learning user and item repres

External link: http://arxiv.org/abs/2404.16555


Report

Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval

Authors: Wen, Haokun; Song, Xuemeng; Chen, Xiaolin; Wei, Yinwei; Nie, Liqiang; Chua, Tat-Seng

Composed image retrieval (CIR) aims to retrieve the target image based on a multimodal query, i.e., a reference image paired with corresponding modification text. Recent CIR studies leverage vision-language pre-trained (VLP) methods as the feature ex

External link: http://arxiv.org/abs/2404.15875


Report

Double Mixture: Towards Continual Event Detection from Speech

Authors: Kang, Jingqi; Wu, Tongtong; Zhao, Jinming; Wang, Guitao; Wei, Yinwei; Yang, Hao; Qi, Guilin; Li, Yuan-Fang; Haffari, Gholamreza

Speech event detection is crucial for multimedia retrieval, involving the tagging of both semantic and acoustic events. Traditional ASR systems often overlook the interplay between these events, focusing solely on content, even though the interpretat

External link: http://arxiv.org/abs/2404.13289


Report

Data-efficient Fine-tuning for LLM-based Recommendation

Authors: Lin, Xinyu; Wang, Wenjie; Li, Yongqi; Yang, Shuo; Feng, Fuli; Wei, Yinwei; Chua, Tat-Seng

Leveraging Large Language Models (LLMs) for recommendation has recently garnered considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the cost of fine-tuning LLMs on rapidly expanding recommendation data limits the

External link: http://arxiv.org/abs/2401.17197


Report

ToDA: Target-oriented Diffusion Attacker against Recommendation System

Authors: Liu, Xiaohao; Tao, Zhulin; Jiang, Ting; Chang, He; Ma, Yunshan; Wei, Yinwei; Wang, Xiang

Recommendation systems (RS) have become indispensable tools for web services to address information overload, thus enhancing user experiences and bolstering platforms' revenues. However, with their increasing ubiquity, security concerns have also eme

External link: http://arxiv.org/abs/2401.12578


Report

Instilling Multi-round Thinking to Text-guided Image Generation

Authors: Zeng, Lidong; Zheng, Zhedong; Wei, Yinwei; Chua, Tat-seng

This paper delves into the text-guided image editing task, focusing on modifying a reference image according to user-specified textual feedback to embody specific attributes. Despite recent advancements, a persistent challenge remains that the single

External link: http://arxiv.org/abs/2401.08472

