Showing 1 - 10
of 696
for the search: '"Han, Shumin"'
Small CNN-based models usually require transferring knowledge from a large model before they are deployed in computationally resource-limited edge devices. Masked image modeling (MIM) methods achieve great success in various visual tasks but remain …
External link:
http://arxiv.org/abs/2309.09571
Recently, large-scale language-image models (e.g., text-guided diffusion models) have considerably improved image generation capabilities, producing photorealistic images in various domains. Based on this success, current image editing methods …
External link:
http://arxiv.org/abs/2305.04441
Pretraining on large-scale datasets can boost the performance of object detectors, while annotated datasets for object detection are hard to scale up due to the high labor cost. What we possess are numerous isolated field-specific datasets; thus, …
External link:
http://arxiv.org/abs/2304.03580
Author:
Duan, Xiaoyue, Kang, Guoliang, Wang, Runqi, Han, Shumin, Xue, Song, Wang, Tian, Zhang, Baochang
Robust Model-Agnostic Meta-Learning (MAML) is usually adopted to train a meta-model which may fast adapt to novel classes with only a few exemplars and meanwhile remain robust to adversarial attacks. The conventional solution for robust MAML is to …
External link:
http://arxiv.org/abs/2211.15180
Author:
Zhang, Xinyu, Chen, Jiahui, Yuan, Junkun, Chen, Qiang, Wang, Jian, Wang, Xiaodi, Han, Shumin, Chen, Xiaokang, Pi, Jimin, Yao, Kun, Han, Junyu, Ding, Errui, Wang, Jingdong
Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches. Applying the reconstruction supervision on the CLIP representation has been proven effective for MIM. However, it is still under-explored how CLIP …
External link:
http://arxiv.org/abs/2211.09799
Author:
Han, Shumin1 (AUTHOR) hanshumin@lnpu.edu.cn, Shen, Kuixing1 (AUTHOR) wangchuang@lnpu.edu.cn, Shen, Derong2 (AUTHOR) shenderong@ise.neu.edu.cn, Wang, Chuang1 (AUTHOR)
Published in:
Mathematics (ISSN 2227-7390), Aug 2024, Vol. 12, Issue 15, p. 2337, 19 pp.
Author:
Wang, Yunhao, Sun, Huixin, Wang, Xiaodi, Zhang, Bin, Li, Chao, Xin, Ying, Zhang, Baochang, Ding, Errui, Han, Shumin
Vision Transformer and its variants have demonstrated great potential in various computer vision tasks. But conventional vision transformers often focus on global dependency at a coarse level, and they suffer from a learning challenge on global relation…
External link:
http://arxiv.org/abs/2209.01620
Author:
Chen, Xiaokang, Ding, Mingyu, Wang, Xiaodi, Xin, Ying, Mo, Shentong, Wang, Yunhao, Han, Shumin, Luo, Ping, Zeng, Gang, Wang, Jingdong
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE), for self-supervised representation pretraining. We pretrain an encoder by making predictions in the encoded representation space. The pretraining tasks include two …
External link:
http://arxiv.org/abs/2202.03026
Author:
Zhang, Anyi, Pan, Xiangyu, Zhang, Ning, Jia, Qiuyue, Wu, Guanjiu, Wang, Wenfeng, Han, Shumin, Li, Yuan, Zhang, Lu
Published in:
In International Journal of Hydrogen Energy 11 October 2024 86:228-235