Showing 1 - 10 of 67 for search: '"Cai, Guanyu"'
Author:
Cai, Guanyu, Ge, Yixiao, Zhang, Binjie, Wang, Alex Jinpeng, Yan, Rui, Lin, Xudong, Shan, Ying, He, Lianghua, Qie, Xiaohu, Wu, Jianping, Shou, Mike Zheng
Recent dominant methods for video-language pre-training (VLP) learn transferable representations from the raw pixels in an end-to-end manner to achieve advanced performance on downstream video-language retrieval. Despite the impressive results, VLP r
External link:
http://arxiv.org/abs/2203.07720
Author:
Wang, Alex Jinpeng, Ge, Yixiao, Yan, Rui, Ge, Yuying, Lin, Xudong, Cai, Guanyu, Wu, Jianping, Shan, Ying, Qie, Xiaohu, Shou, Mike Zheng
Mainstream Video-Language Pre-training models \cite{actbert,clipbert,violet} consist of three parts, a video encoder, a text encoder, and a video-text fusion Transformer. They pursue better performance via utilizing heavier unimodal encoders or multi
External link:
http://arxiv.org/abs/2203.07303
Author:
Yan, Rui, Shou, Mike Zheng, Ge, Yixiao, Wang, Alex Jinpeng, Lin, Xudong, Cai, Guanyu, Tang, Jinhui
Video-Text pre-training aims at learning transferable representations from large-scale video-text pairs via aligning the semantics between visual and textual information. State-of-the-art approaches extract visual features from raw pixels in an end-t
External link:
http://arxiv.org/abs/2112.01194
Author:
Wang, Alex Jinpeng, Ge, Yixiao, Cai, Guanyu, Yan, Rui, Lin, Xudong, Shan, Ying, Qie, Xiaohu, Shou, Mike Zheng
Recently, by introducing large-scale datasets and strong transformer networks, video-language pre-training has shown great success, especially for retrieval. Yet, existing video-language transformer models do not explicitly perform fine-grained semantic alignment.
External link:
http://arxiv.org/abs/2112.00656
Author:
Cai, Guanyu, Seguin, Johanne, Naillon, Thomas, Chanéac, Corinne, Corvis, Yohann, Scherman, Daniel, Mignet, Nathalie, Viana, Bruno, Richard, Cyrille
Published in:
In Chemical Engineering Journal 15 June 2024 490
Author:
Cai, Guanyu, He, Lianghua
Recent advances in unsupervised domain adaptation have seen considerable progress in semantic segmentation. Existing methods either align different domains with adversarial training or involve self-learning that utilizes pseudo labels to conduct
External link:
http://arxiv.org/abs/2105.12939
Author:
Cai, Guanyu, Zhang, Jun, Jiang, Xinyang, Gong, Yifei, He, Lianghua, Yu, Fufu, Peng, Pai, Guo, Xiaowei, Huang, Feiyue, Sun, Xing
Text-based image retrieval has seen considerable progress in recent years. However, the performance of existing methods suffers in real life since the user is likely to provide an incomplete description of an image, which often leads to results fille
External link:
http://arxiv.org/abs/2103.01654
Author:
Gao, Chenyang, Cai, Guanyu, Jiang, Xinyang, Zheng, Feng, Zhang, Jun, Gong, Yifei, Peng, Pai, Guo, Xiaowei, Sun, Xing
Text-based person search aims at retrieving a target person from an image gallery using a descriptive sentence about that person. It is very challenging since the modal gap makes effectively extracting discriminative features more difficult. Moreover, the inter
External link:
http://arxiv.org/abs/2101.03036
Author:
Delgado, Teresa, Rytz, Daniel, Cai, Guanyu, Allix, Mathieu, Veron, Emmanuel, di Carlo, Ida, Viana, Bruno
Published in:
In Ceramics International 15 December 2023 49(24) Part B:41031-41040
Typical adversarial-training-based unsupervised domain adaptation methods are vulnerable when the source and target datasets are highly complex or exhibit a large discrepancy between their data distributions. Recently, several Lipschitz-constraint-ba
External link:
http://arxiv.org/abs/1905.10748