Showing 1 - 10 of 125 for search: '"Hu, Wenze"'
Author:
Lai, Zhengfeng, Saveris, Vasileios, Chen, Chen, Chen, Hong-You, Zhang, Haotian, Zhang, Bowen, Tebar, Juan Lao, Hu, Wenze, Gan, Zhe, Grasch, Peter, Cao, Meng, Yang, Yinfei
Recent advancements in multimodal models highlight the value of rewritten captions for improving performance, yet key challenges remain. For example, while synthetic captions often provide superior quality and image-text alignment, it is not clear wh
External link:
http://arxiv.org/abs/2410.02740
Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to captu
External link:
http://arxiv.org/abs/2309.17102
Author:
Lin, Feng, Hu, Wenze, Wang, Yaowei, Tian, Yonghong, Lu, Guangming, Chen, Fanglin, Xu, Yong, Wang, Xiaoyu
Over the past few years, there has been growing interest in developing a broad, universal, and general-purpose computer vision system. Such systems have the potential to address a wide range of vision tasks simultaneously, without being limited to sp
External link:
http://arxiv.org/abs/2212.09408
With the wide and deep adoption of deep learning models in real applications, there is an increasing need to model and learn the representations of the neural networks themselves. These models can be used to estimate attributes of different neural ne
External link:
http://arxiv.org/abs/2211.08024
Currently, one main research line in designing a more efficient vision transformer is reducing the computational cost of self-attention modules by adopting sparse attention or using local attention windows. In contrast, we propose a different approac
External link:
http://arxiv.org/abs/2211.07198
Transformers have shown great potential in various computer vision tasks. By borrowing design concepts from transformers, many studies revolutionized CNNs and showed remarkable results. This paper falls in this line of studies. Specifically, we propo
External link:
http://arxiv.org/abs/2211.07157
Transformer models have made tremendous progress in various fields in recent years. In the field of computer vision, vision transformers (ViTs) have also become strong alternatives to convolutional neural networks (ConvNets), yet they have not been able t
External link:
http://arxiv.org/abs/2210.04020
Author:
Feng, Zhanpeng, Zhang, Shiliang, Takezoe, Rinyoichi, Hu, Wenze, Chandraker, Manmohan, Li, Li-Jia, Narayanan, Vijay K., Wang, Xiaoyu
Active learning is an important technology for automated machine learning systems. In contrast to Neural Architecture Search (NAS), which aims at automating neural network architecture design, active learning aims at automating training data selection
External link:
http://arxiv.org/abs/2207.13339
Automated machine learning systems for non-experts could be critical for industries to adopt artificial intelligence in their own applications. This paper details the engineering system implementation of an automated machine learning system called Y
External link:
http://arxiv.org/abs/2203.15784
Recently, vision transformers have started to show impressive results that significantly outperform large convolution-based models. However, in the area of small models for mobile or resource-constrained devices, ConvNet still has its own advantages in b
External link:
http://arxiv.org/abs/2203.03952