Showing 1 - 10 of 22,474 for search: '"Kim, Soo A."'
Author:
Liu, Shaoteng, Wang, Tianyu, Wang, Jui-Hsien, Liu, Qing, Zhang, Zhifei, Lee, Joon-Young, Li, Yijun, Yu, Bei, Lin, Zhe, Kim, Soo Ye, Jia, Jiaya
Large-scale video generation models have the inherent ability to realistically model natural scenes. In this paper, we demonstrate that through a careful design of a generative video propagation framework, various video tasks can be addressed in a un
External link:
http://arxiv.org/abs/2412.19761
Visual-Language Models (VLMs) have become a powerful tool for bridging the gap between visual and linguistic understanding. However, the conventional learning approaches for VLMs often suffer from limitations, such as the high resource requirements o
External link:
http://arxiv.org/abs/2412.12940
Recent AI-based video editing has enabled users to edit videos through simple text prompts, significantly simplifying the editing process. However, recent zero-shot video editing techniques primarily focus on global or single-object edits, which can
External link:
http://arxiv.org/abs/2412.12877
Author:
Chen, Xi, Zhang, Zhifei, Zhang, He, Zhou, Yuqian, Kim, Soo Ye, Liu, Qing, Li, Yijun, Zhang, Jianming, Zhao, Nanxuan, Wang, Yilin, Ding, Hui, Lin, Zhe, Zhao, Hengshuang
We introduce UniReal, a unified framework designed to address various image generation and editing tasks. Existing solutions often vary by tasks, yet share fundamental principles: preserving consistency between inputs and outputs while capturing visu
External link:
http://arxiv.org/abs/2412.07774
Author:
Wang, Tianyu, Zhang, Jianming, Zheng, Haitian, Ding, Zhihong, Cohen, Scott, Lin, Zhe, Xiong, Wei, Fu, Chi-Wing, Figueroa, Luis, Kim, Soo Ye
Shadows are often under-considered or even ignored in image editing applications, limiting the realism of the edited results. In this paper, we introduce MetaShadow, a three-in-one versatile framework that enables detection, removal, and controllable
External link:
http://arxiv.org/abs/2412.02635
Author:
Song, Yizhi, He, Liu, Zhang, Zhifei, Kim, Soo Ye, Zhang, He, Xiong, Wei, Lin, Zhe, Price, Brian, Cohen, Scott, Zhang, Jianming, Aliaga, Daniel
Personalized image generation has emerged from the recent advancements in generative models. However, these generated personalized images often suffer from localized artifacts such as incorrect logos, reducing fidelity and fine-grained identity detai
External link:
http://arxiv.org/abs/2412.00306
Author:
Yang, Jinrui, Liu, Qing, Li, Yijun, Kim, Soo Ye, Pakhomov, Daniil, Ren, Mengwei, Zhang, Jianming, Lin, Zhe, Xie, Cihang, Zhou, Yuyin
Recent advancements in large generative models, particularly diffusion-based methods, have significantly enhanced the capabilities of image editing. However, achieving precise control over image composition tasks remains a challenge. Layered represen
External link:
http://arxiv.org/abs/2411.17864
Author:
Cai, Yuanhao, Zhang, He, Zhang, Kai, Liang, Yixun, Ren, Mengwei, Luan, Fujun, Liu, Qing, Kim, Soo Ye, Zhang, Jianming, Zhang, Zhifei, Zhou, Yuqian, Lin, Zhe, Yuille, Alan
Existing feed-forward image-to-3D methods mainly rely on 2D multi-view diffusion models that cannot guarantee 3D consistency. These methods easily collapse when changing the prompt view direction and mainly handle object-centric prompt images. In thi
External link:
http://arxiv.org/abs/2411.14384
Author:
Nguyen, Quang Vinh, Son, Vo Hoang Thanh, Hoang, Chau Truong Vinh, Nguyen, Duc Duy, Minh, Nhat Huy Nguyen, Kim, Soo-Hyung
The naturalistic driving action localization task aims to recognize and comprehend human behaviors and actions from video data captured during real-world driving scenarios. Previous studies have shown great action localization performance by applying a r
External link:
http://arxiv.org/abs/2411.12525
Automatic polyp segmentation is crucial for effective diagnosis and treatment in colonoscopy images. Traditional methods encounter significant challenges in accurately delineating polyps due to limitations in feature representation and the handling o
External link:
http://arxiv.org/abs/2410.01210