Search results

Report

SwinStyleformer is a favorable choice for image inversion

Authors: Mao, Jiawei; Zhao, Guangyi; Yin, Xuesong; Chang, Yuanqi

This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of obje

External link: http://arxiv.org/abs/2406.13153

Report

Restorer: Removing Multi-Degradation with All-Axis Attention and Prompt Guidance

Authors: Mao, Jiawei; Wu, Juncheng; Zhou, Yuyin; Yin, Xuesong; Chang, Yuanqi

There are many excellent solutions in image restoration. However, most methods require training separate models to restore images with different types of degradation. Although existing all-in-one models effectively address multiple types of degradat

External link: http://arxiv.org/abs/2406.12587

Report

Medical supervised masked autoencoders: Crafting a better masking strategy and efficient fine-tuning schedule for medical image classification

Authors: Mao, Jiawei; Guo, Shujian; Chang, Yuanqi; Yin, Xuesong; Nie, Binling

Masked autoencoders (MAEs) have displayed significant potential in the classification and semantic segmentation of medical images in the last year. Due to the high similarity of human tissues, even slight changes in medical images may represent disea

External link: http://arxiv.org/abs/2305.05871

Report

Star-Net: Improving Single Image Desnowing Model With More Efficient Connection and Diverse Feature Interaction

Authors: Mao, Jiawei; Chang, Yuanqi; Yin, Xuesong; Nie, Binling

Compared to other severe weather image restoration tasks, single image desnowing is a more challenging task. This is mainly due to the diversity and irregularity of snow shape, which makes it extremely difficult to restore images in snowy scenes. Mor

External link: http://arxiv.org/abs/2303.09988

Report

POSTER++: A simpler and stronger facial expression recognition network

Authors: Mao, Jiawei; Xu, Rui; Yin, Xuesong; Chang, Yuanqi; Nie, Binling; Huang, Aibin

Facial expression recognition (FER) plays an important role in a variety of real-world applications such as human-computer interaction. POSTER achieves state-of-the-art (SOTA) performance in FER by effectively combining facial landmark and image

External link: http://arxiv.org/abs/2301.12149

Report

Masked autoencoders are effective solution to transformer data-hungry

Authors: Mao, Jiawei; Zhou, Honggu; Yin, Xuesong; Chang, Yuanqi; Nie, Binling; Xu, Rui

Vision Transformers (ViTs) outperform convolutional neural networks (CNNs) in several vision tasks thanks to their global modeling capabilities. However, ViTs lack the inductive bias inherent to convolution, making them require a large amount of data for trai

External link: http://arxiv.org/abs/2212.05677

Report

More comprehensive facial inversion for more effective expression recognition

Authors: Mao, Jiawei; Zhao, Guangyi; Chang, Yuanqi; Yin, Xuesong; Peng, Xiaogang; Xu, Rui

Facial expression recognition (FER) plays a significant role in the ubiquitous application of computer vision. We revisit this problem with a new perspective on whether it can acquire useful representations that improve FER performance in the image g

External link: http://arxiv.org/abs/2211.13564

Report

PointCMC: Cross-Modal Multi-Scale Correspondences Learning for Point Cloud Understanding

Authors: Zhou, Honggu; Peng, Xiaogang; Mao, Jiawei; Wu, Zizhao; Zeng, Ming

Some self-supervised cross-modal learning approaches have recently demonstrated the potential of image signals for enhancing point cloud representation. However, it remains an open question how to directly model cross-modal local and global corresponden

External link: http://arxiv.org/abs/2211.12032

Report

Token Transformer: Can class token help window-based transformer build better long-range interactions?

Authors: Mao, Jiawei; Chang, Yuanqi; Yin, Xuesong

Compared with the vanilla transformer, the window-based transformer offers a better trade-off between accuracy and efficiency. Although the window-based transformer has made great progress, its long-range modeling capabilities are limited due to the

External link: http://arxiv.org/abs/2211.06083

Report

Improvements to Self-Supervised Representation Learning for Masked Image Modeling

Authors: Mao, Jiawei; Yin, Xuesong; Chang, Yuanqi; Zhou, Honggu

This paper explores improvements to the masked image modeling (MIM) paradigm. The MIM paradigm enables the model to learn the main object features of the image by masking the input image and predicting the masked part from the unmasked part. We found t

External link: http://arxiv.org/abs/2205.10546
