Showing 1 - 10 of 1,652 for search: '"Hengel, P."'
Author:
Cong, Gaoxiang, Pan, Jiadong, Li, Liang, Qi, Yuankai, Peng, Yuxin, Hengel, Anton van den, Yang, Jian, Huang, Qingming
Given a piece of text, a video clip, and a reference audio, the movie dubbing task aims to generate speech that aligns with the video while cloning the desired voice. The existing methods have two primary deficiencies: (1) They struggle to simultaneo
External link:
http://arxiv.org/abs/2412.08988
Author:
Chen, Qi, Zhao, Ruoshan, Wang, Sinuo, Phan, Vu Minh Hieu, Hengel, Anton van den, Verjans, Johan, Liao, Zhibin, To, Minh-Son, Xia, Yong, Chen, Jian, Xie, Yutong, Wu, Qi
Medical vision-and-language models (MVLMs) have attracted substantial interest due to their capability to offer a natural language interface for interpreting complex medical data. Their applications are versatile and have the potential to improve dia
External link:
http://arxiv.org/abs/2411.12195
Author:
Cao, Haiyao, Zou, Jinan, Liu, Yuhang, Zhang, Zhen, Abbasnejad, Ehsan, Hengel, Anton van den, Shi, Javen Qinfeng
Accurately predicting stock returns is crucial for effective portfolio management. However, existing methods often overlook a fundamental issue in the market, namely, distribution shifts, making them less practical for predicting future markets or ne
External link:
http://arxiv.org/abs/2409.00671
Author:
Cao, Haiyao, Zhang, Zhen, Cai, Panpan, Liu, Yuhang, Zou, Jinan, Abbasnejad, Ehsan, Huang, Biwei, Gong, Mingming, Hengel, Anton van den, Shi, Javen Qinfeng
One of the significant challenges in reinforcement learning (RL) when dealing with noise is estimating latent states from observations. Causality provides rigorous theoretical support for ensuring that the underlying states can be uniquely recovered
External link:
http://arxiv.org/abs/2408.13498
Author:
Chowdhury, Townim F., Phan, Vu Minh Hieu, Liao, Kewen, To, Minh-Son, Xie, Yutong, Hengel, Anton van den, Verjans, Johan W., Liao, Zhibin
The integration of vision-language models such as CLIP and Concept Bottleneck Models (CBMs) offers a promising approach to explaining deep neural network (DNN) decisions using concepts understandable by humans, addressing the black-box concern of DNN
External link:
http://arxiv.org/abs/2408.02001
Automatically generating symbolic music (music scores tailored to specific human needs) can be highly beneficial for musicians and enthusiasts. Recent studies have shown promising results using extensive datasets and advanced transformer architectures.
External link:
http://arxiv.org/abs/2407.04331
Author:
Zhang, Frederic Z., Albert, Paul, Rodriguez-Opazo, Cristian, Hengel, Anton van den, Abbasnejad, Ehsan
Pre-trained models produce strong generic representations that can be adapted via fine-tuning. The learned weight difference relative to the pre-trained model, known as a task vector, characterises the direction and stride of fine-tuning. The signifi
External link:
http://arxiv.org/abs/2407.02880
Author:
Rodriguez-Opazo, Cristian, Abbasnejad, Ehsan, Teney, Damien, Marrese-Taylor, Edison, Damirchi, Hamed, Hengel, Anton van den
Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various architectures, from vision transformers (ViTs) to convolutional networks (ResNets), have been trained with CLIP to serve as gener
External link:
http://arxiv.org/abs/2405.17139
For many recommender systems, the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, as the number of users × products can be far larger than the number of clicks. Such sparsity is accentuated
External link:
http://arxiv.org/abs/2404.13298
Author:
Chowdhury, Townim Faisal, Liao, Kewen, Phan, Vu Minh Hieu, To, Minh-Son, Xie, Yutong, Hung, Kevin, Ross, David, Hengel, Anton van den, Verjans, Johan W., Liao, Zhibin
Deep Neural Networks (DNNs) are widely used for visual classification tasks, but their complex computation process and black-box nature hinder decision transparency and interpretability. Class activation maps (CAMs) and recent variants provide ways t
External link:
http://arxiv.org/abs/2404.02388