Showing 1 - 2 of 2 for search: '"Hai, Bowen"'
Learned Image Compression (LIC) models have achieved better rate-distortion performance than traditional codecs. Existing LIC models use CNN, Transformer, or Mixed CNN-Transformer as basic blocks. However, limited by the shifted window attention, S
External link:
http://arxiv.org/abs/2409.14090
Pre-trained Vision-Language Models (VLMs), such as CLIP, have shown enhanced performance across a range of tasks that involve the integration of visual and linguistic modalities. When CLIP is used for depth estimation tasks, the patches, divided from
External link:
http://arxiv.org/abs/2311.01034