Search results

Report

Window-based Channel Attention for Wavelet-enhanced Learned Image Compression

Authors: Xu, Heng; Hai, Bowen; Tang, Yushun; He, Zhihai

Learned Image Compression (LIC) models have achieved better rate-distortion performance than traditional codecs. Existing LIC models use CNN, Transformer, or Mixed CNN-Transformer as basic blocks. However, limited by the shifted window attention, S

External link: http://arxiv.org/abs/2409.14090

Report

Learning to Adapt CLIP for Few-Shot Monocular Depth Estimation

Authors: Hu, Xueting; Zhang, Ce; Zhang, Yi; Hai, Bowen; Yu, Ke; He, Zhihai

Pre-trained Vision-Language Models (VLMs), such as CLIP, have shown enhanced performance across a range of tasks that involve the integration of visual and linguistic modalities. When CLIP is used for depth estimation tasks, the patches, divided from

External link: http://arxiv.org/abs/2311.01034
