Showing 1 - 10 of 27 for search: '"Xu, Zhengzhuo"'
Author:
Du, Sinan, Zhang, Guosheng, Wang, Keyao, Wang, Yuanrui, Yue, Haixiao, Zhang, Gang, Ding, Errui, Wang, Jingdong, Xu, Zhengzhuo, Yuan, Chun
Parameter-efficient transfer learning (PETL) has become a promising paradigm for adapting large-scale vision foundation models to downstream tasks. Typical methods primarily leverage the intrinsic low rank property to make decomposition, learning tas
External link:
http://arxiv.org/abs/2412.08341
Automatic chart understanding is crucial for content comprehension and document parsing. Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in chart understanding through domain-specific alignment and fine-tuning. Howe
External link:
http://arxiv.org/abs/2409.03277
Author:
Liu, Ruikang, Bai, Haoli, Lin, Haokun, Li, Yuening, Gao, Han, Xu, Zhengzhuo, Hou, Lu, Yao, Jun, Yuan, Chun
Large language models (LLMs) excel in natural language processing but demand intensive computation. To mitigate this, various quantization methods have been explored, yet they compromise LLM performance. This paper unveils a previously overlooked typ
External link:
http://arxiv.org/abs/2403.01241
Multimodal Large Language Models (MLLMs) have shown impressive capabilities in image understanding and generation. However, current benchmarks fail to accurately evaluate the chart comprehension of MLLMs due to limited chart types and inappropriate m
External link:
http://arxiv.org/abs/2312.15915
Network Intrusion Detection (NID) works as a kernel technology for the security network environment, obtaining extensive research and application. Despite enormous efforts by researchers, NID still faces challenges in deploying on resource-constraine
External link:
http://arxiv.org/abs/2307.10191
Real-world data usually suffers from severe class imbalance and long-tailed distributions, where minority classes are significantly underrepresented compared to the majority ones. Recent research prefers to utilize multi-expert architectures to mitig
External link:
http://arxiv.org/abs/2305.03378
In the real world, data tends to follow long-tailed distributions w.r.t. class or attribution, motivating the challenging Long-Tailed Recognition (LTR) problem. In this paper, we revisit recent LTR methods with promising Vision Transformers (ViT). We
External link:
http://arxiv.org/abs/2302.14284
Real-world data tends to be heavily imbalanced and severely skews data-driven deep neural networks, which makes Long-Tailed Recognition (LTR) a massively challenging task. Existing LTR methods seldom train Vision Transformers (ViTs) with Long-Ta
External link:
http://arxiv.org/abs/2212.02015
Image retrieval has become an increasingly appealing technique with broad multimedia application prospects, where deep hashing serves as the dominant branch towards low storage and efficient retrieval. In this paper, we carried out in-depth investiga
External link:
http://arxiv.org/abs/2208.06866
Author:
Chai, Zenghao, Zhang, Haoxian, Ren, Jing, Kang, Di, Xu, Zhengzhuo, Zhe, Xuefei, Yuan, Chun, Bao, Linchao
The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan. We observe that aligning two shapes with different reference points can largely affect the evaluati
External link:
http://arxiv.org/abs/2203.09729