Showing 1 - 10 of 1,877 results for search: '"Wang, Yibin"'
Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the
External link:
http://arxiv.org/abs/2406.11675
Technology-driven precision livestock farming (PLF) empowers practitioners to monitor and analyze animal growth and health conditions for improved productivity and welfare. Computer vision (CV) is indispensable in PLF by using cameras and computer al
External link:
http://arxiv.org/abs/2406.10628
Scene text synthesis involves rendering specified texts onto arbitrary images. Current methods typically formulate this task in an end-to-end manner but lack effective character-level guidance during training. Besides, their text encoders, pre-traine
External link:
http://arxiv.org/abs/2405.14701
Authors:
Shi, Haizhou, Xu, Zihao, Wang, Hengyi, Qin, Weiyi, Wang, Wenyuan, Wang, Yibin, Wang, Zifeng, Ebrahimi, Sayna, Wang, Hao
The recent success of large language models (LLMs) trained on static, pre-collected, general datasets has sparked numerous research directions and applications. One such direction addresses the non-trivial challenge of integrating pre-trained LLMs in
External link:
http://arxiv.org/abs/2404.16789
PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering
Image composition involves seamlessly integrating given objects into a specific visual context. The current training-free methods rely on composing attention weights from several samplers to guide the generator. However, since these weights are deriv
External link:
http://arxiv.org/abs/2403.05053
Authors:
Lee, Seungjae, Wang, Yibin, Etukuru, Haritheja, Kim, H. Jin, Shafiullah, Nur Muhammad Mahi, Pinto, Lerrel
Generative modeling of complex behaviors from labeled datasets has been a longstanding problem in decision making. Unlike language or image generation, decision making requires modeling actions - continuous-valued vectors that are multimodal in their
External link:
http://arxiv.org/abs/2403.03181
Layout-to-image synthesis is an emerging technique in conditional image generation. It aims to generate complex scenes, where users require fine control over the layout of the objects in a scene. However, it remains challenging to control the object
External link:
http://arxiv.org/abs/2311.10522
Current subject-driven image generation methods encounter significant challenges in person-centric image generation. The reason is that they learn the semantic scene and person generation by fine-tuning a common pre-trained diffusion, which involves
External link:
http://arxiv.org/abs/2311.10329
In this paper, we study the problem of MOOC quality evaluation which is essential for improving the course materials, promoting students' learning efficiency, and benefiting user services. While achieving promising performances, current works still s
External link:
http://arxiv.org/abs/2301.01593
While large-scale sequence modeling from offline data has led to impressive performance gains in natural language and image generation, directly translating such ideas to robotics has been challenging. One critical reason for this is that uncurated r
External link:
http://arxiv.org/abs/2210.10047