Showing 1 - 9 of 9
for search: '"Du, Chenzhuang"'
With the growing success of multi-modal learning, research on the robustness of multi-modal models, especially when facing situations with missing modalities, is receiving increased attention. Nevertheless, previous studies in this domain exhibit certain …
External link:
http://arxiv.org/abs/2310.06383
This paper investigates how to better leverage large-scale pre-trained uni-modal models to further enhance discriminative multi-modal learning. Even when fine-tuned with only uni-modal data, these models can outperform previous multi-modal models in …
External link:
http://arxiv.org/abs/2310.05193
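To make the idea of reusing pre-trained uni-modal models concrete, here is a minimal late-fusion sketch in PyTorch: two independently fine-tuned encoders each feed a linear head, and the class logits are averaged. All names (LateFusionEnsemble, the encoders, the dimensions) are illustrative assumptions, not the paper's actual architecture.

```python
import torch.nn as nn

class LateFusionEnsemble(nn.Module):
    """Average the logits of two fine-tuned uni-modal models (late fusion)."""
    def __init__(self, image_encoder, audio_encoder, dim_img, dim_aud, num_classes):
        super().__init__()
        self.image_encoder = image_encoder  # pre-trained, fine-tuned on images only
        self.audio_encoder = audio_encoder  # pre-trained, fine-tuned on audio only
        self.head_img = nn.Linear(dim_img, num_classes)
        self.head_aud = nn.Linear(dim_aud, num_classes)

    def forward(self, image, audio):
        z_img = self.image_encoder(image)  # (batch, dim_img)
        z_aud = self.audio_encoder(audio)  # (batch, dim_aud)
        # Each modality votes with its own logits; no cross-modal parameters needed.
        return 0.5 * (self.head_img(z_img) + self.head_aud(z_aud))
```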
Large language models (LLMs) with memory are computationally universal. However, mainstream LLMs are not taking full advantage of memory, and the designs are heavily influenced by biological brains. Due to their approximate nature and proneness to the …
External link:
http://arxiv.org/abs/2306.03901
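One way to give an LLM exact rather than approximate memory is to pair it with a symbolic store such as a SQL database. The sketch below assumes a hypothetical call_llm function standing in for any chat-completion API; it illustrates the general database-as-memory loop, not necessarily this paper's exact pipeline.

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM chat-completion API."""
    raise NotImplementedError

def answer_with_symbolic_memory(question: str, db_path: str = "memory.db") -> str:
    # Ask the model to express the question as SQL over the memory database.
    sql = call_llm(f"Write one SQLite query that answers: {question}")
    conn = sqlite3.connect(db_path)
    rows = conn.execute(sql).fetchall()  # exact retrieval, no approximation
    conn.close()
    # Let the model phrase the final answer from the exact query result.
    return call_llm(f"Question: {question}\nQuery result: {rows}\nAnswer concisely.")
```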
Author:
Du, Chenzhuang, Teng, Jiaye, Li, Tingle, Liu, Yichen, Yuan, Tianyuan, Wang, Yue, Yuan, Yang, Zhao, Hang
We abstract the features (i.e. learned representations) of multi-modal data into 1) uni-modal features, which can be learned from uni-modal training, and 2) paired features, which can only be learned from cross-modal interactions. Multi-modal models …
External link:
http://arxiv.org/abs/2305.01233
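A toy example of a "paired feature" that no uni-modal model can learn: an XOR-style label that depends only on the interaction between the two modalities. The data and names here are synthetic, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x1 = rng.choice([-1.0, 1.0], size=n)  # feature from modality 1
x2 = rng.choice([-1.0, 1.0], size=n)  # feature from modality 2
y = (x1 * x2 > 0).astype(float)       # label = XOR-like cross-modal interaction

# Each modality alone carries no signal: correlation with y is near zero,
# so the best uni-modal predictor is chance.
print(np.corrcoef(x1, y)[0, 1], np.corrcoef(x2, y)[0, 1])
# The paired feature x1*x2 determines the label exactly.
print(np.mean(((x1 * x2) > 0).astype(float) == y))  # 1.0
```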
In vision-based reinforcement learning (RL) tasks, it is prevalent to assign auxiliary tasks with a surrogate self-supervised loss so as to obtain more semantic representations and improve sample efficiency. However, abundant information in self-supervised …
External link:
http://arxiv.org/abs/2106.13970
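The common pattern this abstract refers to looks roughly like the sketch below: share one encoder between the RL objective and a self-supervised auxiliary task, and optimize a weighted sum of the two losses. The reconstruction task and the 0.1 weight are assumptions for illustration, not details from the paper.

```python
import torch.nn.functional as F

def combined_loss(encoder, q_head, decoder, obs, q_target, aux_weight=0.1):
    z = encoder(obs)
    rl_loss = F.mse_loss(q_head(z), q_target)  # TD-style value regression
    aux_loss = F.mse_loss(decoder(z), obs)     # auxiliary self-supervised reconstruction
    # The auxiliary gradient shapes the shared representation z.
    return rl_loss + aux_weight * aux_loss
```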
Learning multi-modal representations is an essential step towards real-world robotic applications, and various multi-modal fusion models have been developed for this purpose. However, we observe that existing models, whose objectives are mostly based …
External link:
http://arxiv.org/abs/2106.11059
The world provides us with data of multiple modalities. Intuitively, models fusing data from different modalities outperform their uni-modal counterparts, since more information is aggregated. Recently, joining the success of deep learning, there is …
External link:
http://arxiv.org/abs/2106.04538
In the classical multi-party computation setting, multiple parties jointly compute a function without revealing their own input data. We consider a variant of this problem, where the input data can be shared for machine learning training purposes, but …
External link:
http://arxiv.org/abs/2009.11762
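For context, the classical setting the abstract starts from can be illustrated with additive secret sharing: each input is split into random shares that sum to the value modulo a prime, so parties can compute a sum without any party seeing another's input. This sketches the standard primitive only, not the paper's variant.

```python
import secrets

P = 2**61 - 1  # prime modulus defining the share field

def share(x: int, parties: int):
    """Split x into additive shares; any parties-1 of them reveal nothing about x."""
    parts = [secrets.randbelow(P) for _ in range(parties - 1)]
    parts.append((x - sum(parts)) % P)
    return parts

def reconstruct(parts):
    return sum(parts) % P

# Shares can be added locally, so a sum is computed without revealing inputs.
a, b = share(42, 3), share(100, 3)
summed = [(sa + sb) % P for sa, sb in zip(a, b)]
assert reconstruct(summed) == 142
```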
Published in:
2022 International Conference on Robotics and Automation (ICRA).
In vision-based reinforcement learning (RL) tasks, it is prevalent to assign auxiliary tasks with a surrogate self-supervised loss so as to obtain more semantic representations and improve sample efficiency. However, abundant information in self-supervised …