Showing 1 - 10 of 450 for search: '"Bao, Bing"'
Multimedia recommender systems focus on utilizing behavioral information and content information to model user preferences. Typically, they employ pre-trained feature encoders to extract content features, then fuse them with behavioral features. Howe
External link:
http://arxiv.org/abs/2406.00323
Story visualization aims to generate a series of realistic and coherent images based on a storyline. Current models adopt a frame-by-frame architecture by transforming the pre-trained text-to-image model into an auto-regressive manner. Although these
External link:
http://arxiv.org/abs/2404.05979
Multimedia recommendation aims to predict users' future behaviors based on historical behavioral data and items' multimodal information. However, noise inherent in behavioral data, arising from unintended user interactions with uninteresting items, d
External link:
http://arxiv.org/abs/2309.15363
Published in:
Data Min Knowl Disc (2024)
Infectious disease forecasting has been a key focus and has proved crucial in controlling epidemics. A recent trend is to develop forecasting models based on graph neural networks (GNNs). However, existing GNN-based methods suffer from two key limi
External link:
http://arxiv.org/abs/2308.15840
Multimedia recommendation has received much attention in recent years. It models user preferences based on both behavior information and item multimodal information. Though current GCN-based methods achieve notable success, they suffer from two limit
External link:
http://arxiv.org/abs/2308.03588
Synthesizing high-fidelity complex images from text is challenging. Based on large pretraining, the autoregressive and diffusion models can synthesize photo-realistic images. Although these large models have shown notable progress, there remain three
External link:
http://arxiv.org/abs/2301.12959
Text-guided image editing models have shown remarkable results. However, there remain two problems. First, they employ fixed manipulation modules for various editing requirements (e.g., color changing, texture changing, content adding and removing),
External link:
http://arxiv.org/abs/2206.01160
Published in:
IEEE Transactions on Multimedia 2021
Video question answering is a challenging task, which requires agents to be able to understand rich video contents and perform spatial-temporal reasoning. However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two
External link:
http://arxiv.org/abs/2107.04768
Author:
Wang, Qijian, Zhang, Xiao, Suo, Yijun, Chen, Zhiying, Wu, Moxin, Wen, Xiaoqin, Lai, Qin, Yin, Xiaoping, Bao, Bing
Published in:
In Heliyon 15 January 2024 10(1)
Synthesizing high-quality realistic images from text descriptions is a challenging task. Existing text-to-image Generative Adversarial Networks generally employ a stacked architecture as the backbone, yet three flaws remain. First, the stacked a
External link:
http://arxiv.org/abs/2008.05865