Search results

Report

When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability

Authors: Xuan, Wenjie, Xu, Yufei, Zhao, Shanshan, Wang, Chaoyue, Liu, Juhua, Du, Bo, Tao, Dacheng

ControlNet excels at creating content that closely matches precise contours in user-provided masks. However, when these masks contain noise, as frequently happens with non-expert users, the output includes unwanted artifacts. This paper first

External link: http://arxiv.org/abs/2403.00467

Report

APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond

Authors: Yang, Yuxiang, Deng, Yingqi, Xu, Yufei, Zhang, Jing

Animal Pose Estimation and Tracking (APT) is a critical task in detecting and monitoring the keypoints of animals across a series of video frames, which is essential for understanding animal behavior. Past works relating to animals have primarily foc

External link: http://arxiv.org/abs/2312.15612

Report

HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting

Authors: Lu, Wenquan, Xu, Yufei, Zhang, Jing, Wang, Chaoyue, Tao, Dacheng

Diffusion models have achieved remarkable success in generating realistic images but struggle to generate accurate human hands, often producing incorrect finger counts or irregular shapes. This difficulty arises from the complex task of learning the physic

External link: http://arxiv.org/abs/2311.17957

Report

Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Opportunities

Authors: Chen, Tao, Lv, Liang, Wang, Di, Zhang, Jing, Yang, Yue, Zhao, Zeyang, Wang, Chen, Guo, Xiaowei, Chen, Hao, Wang, Qingye, Xu, Yufei, Zhang, Qiming, Du, Bo, Zhang, Liangpei, Tao, Dacheng

With the world population rapidly increasing, transforming our agrifood systems to be more productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages. Recently, artificial intelligence (AI) techniques such as deep le

External link: http://arxiv.org/abs/2305.01899

Report

Vision Transformer with Quadrangle Attention

Authors: Zhang, Qiming, Zhang, Jing, Xu, Yufei, Tao, Dacheng

Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint. However, the design of hand-crafted windows, which is data-agnostic, constrains the

External link: http://arxiv.org/abs/2303.15105

Report

ViTPose++: Vision Transformer for Generic Body Pose Estimation

Authors: Xu, Yufei, Zhang, Jing, Zhang, Qiming, Tao, Dacheng

In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability

External link: http://arxiv.org/abs/2212.04246

Report

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Authors: Kiefer, Benjamin, Kristan, Matej, Perš, Janez, Žust, Lojze, Poiesi, Fabio, Andrade, Fabio Augusto de Alcantara, Bernardino, Alexandre, Dawkins, Matthew, Raitoharju, Jenni, Quan, Yitong, Atmaca, Adem, Höfer, Timon, Zhang, Qiming, Xu, Yufei, Zhang, Jing, Tao, Dacheng, Sommer, Lars, Spraul, Raphael, Zhao, Hangyue, Zhang, Hongpu, Zhao, Yanyun, Augustin, Jan Lukas, Jeon, Eui-ik, Lee, Impyeong, Zedda, Luca, Loddo, Andrea, Di Ruberto, Cecilia, Verma, Sagar, Gupta, Siddharth, Muralidhara, Shishir, Hegde, Niharika, Xing, Daitao, Evangeliou, Nikolaos, Tzes, Anthony, Bartl, Vojtěch, Špaňhel, Jakub, Herout, Adam, Bhowmik, Neelanjan, Breckon, Toby P., Kundargi, Shivanand, Anvekar, Tejas, Desai, Chaitra, Tabib, Ramesh Ashok, Mudengudi, Uma, Vats, Arpita, Song, Yang, Liu, Delong, Li, Yonglin, Li, Shuman, Tan, Chenhao, Lan, Long, Somers, Vladimir, De Vleeschouwer, Christophe, Alahi, Alexandre, Huang, Hsiang-Wei, Yang, Cheng-Yen, Hwang, Jenq-Neng, Kim, Pyong-Kun, Kim, Kwangju, Lee, Kyoungoh, Jiang, Shuai, Li, Haiwen, Ziqiang, Zheng, Vu, Tuan-Anh, Nguyen-Truong, Hai, Yeung, Sai-Kit, Jia, Zhuang, Yang, Sophia, Hsu, Chih-Chung, Hou, Xiu-Yu, Jhang, Yu-An, Yang, Simon, Yang, Mau-Tsuen

The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritim

External link: http://arxiv.org/abs/2211.13508

Report

Rethinking Hierarchies in Pre-trained Plain Vision Transformer

Authors: Xu, Yufei, Zhang, Jing, Zhang, Qiming, Tao, Dacheng

Self-supervised pre-training of vision transformers (ViT) via masked image modeling (MIM) has proven very effective. However, customized algorithms should be carefully designed for the hierarchical ViTs, e.g., GreenMIM, instead of using the vanilla

External link: http://arxiv.org/abs/2211.01785

Report

Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model

Authors: Wang, Di, Zhang, Qiming, Xu, Yufei, Zhang, Jing, Du, Bo, Tao, Dacheng, Zhang, Liangpei

Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability. However, large-scale models in remote s

External link: http://arxiv.org/abs/2208.03987

Report

Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection

Authors: Chen, Zhe, Zhang, Jing, Xu, Yufei, Tao, Dacheng

Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF) which aims to mitigate the gap between features from different levels and form a comprehensive object representation to achieve better detectio

External link: http://arxiv.org/abs/2207.06603
