Showing 1 - 10 of 1,242 for search: '"Li Yunsong"'
Visual language models like Contrastive Language-Image Pretraining (CLIP) have shown impressive performance in analyzing natural images with language information. However, these models often encounter challenges when applied to specialized domains su
External link:
http://arxiv.org/abs/2412.07119
Author:
Guo, Wenjin, Liu, Donglai, Xie, Weiying, Li, Yunsong, Ning, Xuefei, Meng, Zihan, Zeng, Shulin, Lei, Jie, Fang, Zhenman, Wang, Yu
Neural network training is a memory- and compute-intensive task. Quantization, which enables low-bitwidth formats in training, can significantly mitigate the workload. To reduce quantization error, recent methods have developed new data formats and a
External link:
http://arxiv.org/abs/2411.10948
Author:
Xu, Kepeng, Xu, Li, He, Gang, Zhang, Zhiqiang, Yu, Wenxin, Wang, Shihao, Zhou, Dajiang, Li, Yunsong
The rise of HDR-WCG display devices has highlighted the need to convert SDRTV to HDRTV, as most video sources are still in SDR. Existing methods primarily focus on designing neural networks to learn a single-style mapping from SDRTV to HDRTV. However
External link:
http://arxiv.org/abs/2411.10775
Recent advances in neural camera imaging pipelines have demonstrated notable progress. Nevertheless, the real-world imaging pipeline still faces challenges including the lack of joint optimization in system components, computational redundancies, and
External link:
http://arxiv.org/abs/2411.10773
Author:
Dong, Shuhan, Li, Yunsong, Xie, Weiying, Zhang, Jiaqing, Tian, Jiayuan, Yang, Danian, Lei, Jie
Multimodal object detection leverages diverse modal information to enhance the accuracy and robustness of detectors. By learning long-term dependencies, Transformer can effectively integrate multimodal features in the feature extraction stage, which
External link:
http://arxiv.org/abs/2410.11358
Author:
Yang, Sheng, Wu, Yurong, Gao, Yan, Zhou, Zineng, Zhu, Bin Benjamin, Sun, Xiaodi, Lou, Jian-Guang, Ding, Zhiming, Hu, Anbang, Fang, Yuan, Li, Yunsong, Chen, Junyan, Yang, Linjun
Prompt engineering is very important to enhance the performance of large language models (LLMs). When dealing with complex issues, prompt engineers tend to distill multiple patterns from examples and inject relevant solutions to optimize the prompts,
External link:
http://arxiv.org/abs/2410.08696
Recent advancements in deep learning have greatly advanced the field of infrared small object detection (IRSTD). Despite their remarkable success, a notable gap persists between these IRSTD methods and generic segmentation approaches in natural image
External link:
http://arxiv.org/abs/2409.04714
Author:
Li, Daixun, Xie, Weiying, Cao, Mingxiang, Wang, Yunke, Zhang, Jiaqing, Li, Yunsong, Fang, Leyuan, Xu, Chang
Multimodal image fusion and segmentation enhance scene understanding in autonomous driving by integrating data from various sensors. However, current models struggle to efficiently segment densely packed elements in such scenes, due to the absence of
External link:
http://arxiv.org/abs/2408.13980
Federated learning (FL) is a decentralized approach, enabling multiple participants to collaboratively train a model while ensuring the protection of data privacy. The transmission of updates from numerous edge clusters to the server creates a signif
External link:
http://arxiv.org/abs/2408.08977
The rapid development of multimedia has provided a large amount of data with different distributions for visual tasks, forming different domains. Federated Learning (FL) can efficiently use this diverse data distributed on different client media in a
External link:
http://arxiv.org/abs/2407.19174