Search results

Report

Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning

Authors: Guo, Zixian, Liu, Ming, Ji, Zhilong, Bai, Jinfeng, Guo, Yiwen, Zuo, Wangmeng

Learning a skill generally relies on both practical experience by doer and insightful high-level guidance by instructor. Will this strategy also work well for solving complex non-convex optimization problems? Here, a common gradient-based optimizer a

External link: http://arxiv.org/abs/2405.19732

Report

MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning

Authors: Yin, Shuo, You, Weihao, Ji, Zhilong, Zhong, Guoqiang, Bai, Jinfeng

The tool-use Large Language Models (LLMs) that integrate with external Python interpreters have significantly enhanced mathematical reasoning capabilities for open-source LLMs, while tool-free methods chose another track: augmenting math reasoning da

External link: http://arxiv.org/abs/2405.07551

Report

MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation

Authors: Wei, Yuxiang, Ji, Zhilong, Bai, Jinfeng, Zhang, Hongzhi, Zhang, Lei, Zuo, Wangmeng

Text-to-image (T2I) diffusion models have shown significant success in personalized text-to-image generation, which aims to generate novel images with human identities indicated by the reference images. Despite promising identity fidelity has been ac

External link: http://arxiv.org/abs/2405.05806

Report

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention

Authors: Tang, Xiaolong, Kan, Meina, Shan, Shiguang, Ji, Zhilong, Bai, Jinfeng, Chen, Xilin

Predicting the trajectories of road agents is essential for autonomous driving systems. The recent mainstream methods follow a static paradigm, which predicts the future trajectory by using a fixed duration of historical frames. These methods make th

External link: http://arxiv.org/abs/2404.06351

Report

Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation

Authors: Guo, Zixian, Wei, Yuxiang, Liu, Ming, Ji, Zhilong, Bai, Jinfeng, Guo, Yiwen, Zuo, Wangmeng

Parameter-efficient fine-tuning (PEFT) methods have provided an effective way for adapting large vision-language models to specific tasks or scenarios. Typically, they learn a very small scale of parameters for pre-trained models in a white-box formu

External link: http://arxiv.org/abs/2312.15901

Report

Decoupled Textual Embeddings for Customized Image Generation

Authors: Cai, Yufei, Wei, Yuxiang, Ji, Zhilong, Bai, Jinfeng, Han, Hu, Zuo, Wangmeng

Customized text-to-image generation, which aims to learn user-specified concepts with a few images, has drawn significant attention recently. However, existing methods usually suffer from overfitting issues and entangle the subject-unrelated informat

External link: http://arxiv.org/abs/2312.11826

Report

Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition

Authors: Liu, Zhuang, Yuan, Ye, Ji, Zhilong, Bai, Jingfeng, Bai, Xiang

Handwritten mathematical expression recognition (HMER) has attracted extensive attention recently. However, current methods cannot explicitly study the interactions between different symbols, which may fail when faced with similar symbols. To alleviate th

External link: http://arxiv.org/abs/2308.10493

Report

Patch Is Not All You Need

Authors: Li, Changzhen, Zhang, Jie, Wei, Yang, Ji, Zhilong, Bai, Jinfeng, Shan, Shiguang

Vision Transformers have achieved great success in computer vision, delivering exceptional performance across various tasks. However, their inherent reliance on sequential input enforces the manual partitioning of images into patch sequences, which

External link: http://arxiv.org/abs/2308.10729

Report

LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network

Authors: Su, Yuchen, Chen, Zhineng, Shao, Zhiwen, Du, Yuning, Ji, Zhilong, Bai, Jinfeng, Zhou, Yong, Jiang, Yu-Gang

Recently, regression-based methods, which predict parameterized text shapes for text localization, have gained popularity in scene text detection. However, the existing parameterized text shape methods still have limitations in modeling arbitrary-sha

External link: http://arxiv.org/abs/2306.15142

Report

Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis

Authors: Wei, Yuxiang, Ji, Zhilong, Wu, Xiaohe, Bai, Jinfeng, Zhang, Lei, Zuo, Wangmeng

Despite the progress in semantic image synthesis, it remains a challenging problem to generate photo-realistic parts from input semantic map. Integrating part segmentation map can undoubtedly benefit image synthesis, but is bothersome and inconvenien

External link: http://arxiv.org/abs/2305.19547
