Search results

Report

Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation

Authors: Wu, Tz-Ying, Min, Kyle, Tripathi, Subarna, Vasconcelos, Nuno

Video understanding typically requires fine-tuning the large backbone when adapting to new domains. In this paper, we leverage the egocentric video foundation models (Ego-VFMs) based on video-language pre-training and propose a parameter-efficient ad

External link: http://arxiv.org/abs/2407.19520

Report

Single-Stage Visual Relationship Learning using Conditional Queries

Authors: Desai, Alakh, Wu, Tz-Ying, Tripathi, Subarna, Vasconcelos, Nuno

Research in scene graph generation (SGG) usually considers two-stage models, that is, detecting a set of entities, followed by combining them and labeling all possible relationships. While showing promising results, the pipeline structure induces lar

External link: http://arxiv.org/abs/2306.05689

Report

ProTeCt: Prompt Tuning for Taxonomic Open Set Classification

Authors: Wu, Tz-Ying, Ho, Chih-Hui, Vasconcelos, Nuno

Visual-language foundation models, like CLIP, learn generalized representations that enable zero-shot open-set classification. Few-shot adaptation methods, based on prompt tuning, have been shown to further improve performance on downstream datasets.

External link: http://arxiv.org/abs/2306.02240

Report

Class-Incremental Learning with Strong Pre-trained Models

Authors: Wu, Tz-Ying, Swaminathan, Gurumurthy, Li, Zhizhong, Ravichandran, Avinash, Vasconcelos, Nuno, Bhotika, Rahul, Soatto, Stefano

Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes). Instead, we explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large

External link: http://arxiv.org/abs/2204.03634

Report

Learning of Visual Relations: The Devil is in the Tails

Authors: Desai, Alakh, Wu, Tz-Ying, Tripathi, Subarna, Vasconcelos, Nuno

Significant effort has been recently devoted to modeling visual relations. This has mostly addressed the design of architectures, typically by adding parameters and increasing model complexity. However, visual relation learning is a long-tailed probl

External link: http://arxiv.org/abs/2108.09668

Report

Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier

Authors: Wu, Tz-Ying, Morgado, Pedro, Wang, Pei, Ho, Chih-Hui, Vasconcelos, Nuno

Long-tail recognition tackles the naturally non-uniform distribution of data in real-world scenarios. While modern classifiers perform well on populated classes, their performance degrades significantly on tail classes. Humans, however, are less affected b

External link: http://arxiv.org/abs/2007.09898

Report

Exploit Clues from Views: Self-Supervised and Regularized Learning for Multiview Object Recognition

Authors: Ho, Chih-Hui, Liu, Bo, Wu, Tz-Ying, Vasconcelos, Nuno

Multiview recognition has been well studied in the literature and achieves decent performance in object recognition and retrieval tasks. However, most previous works rely on supervised learning and some impractical underlying assumptions, such as the

External link: http://arxiv.org/abs/2003.12735

Report

Explainable Object-induced Action Decision for Autonomous Vehicles

Authors: Xu, Yiran, Yang, Xiaoyin, Gong, Lihang, Lin, Hsuan-Chu, Wu, Tz-Ying, Li, Yunsheng, Vasconcelos, Nuno

A new paradigm is proposed for autonomous driving. The new paradigm lies between the end-to-end and pipelined approaches, and is inspired by how humans solve the problem. While it relies on scene understanding, the latter only considers objects that

External link: http://arxiv.org/abs/2003.09405

Dissertation/Thesis

Reconstruction Research and Innovation Design of the Howard Tower Clocks

Author: WU Tz-Cheng, 吳梓誠

105
The reconstruction design of ancient machinery requires reconstructing various designs based on the existing literature, without violating the known scientific principles and technical skills of the period in question. The purpose of

External link: http://ndltd.ncl.edu.tw/handle/3g32w4

Report

Liquid Pouring Monitoring via Rich Sensory Inputs

Authors: Wu, Tz-Ying, Lin, Juan-Ting, Wang, Tsun-Hsuang, Hu, Chan-Wei, Niebles, Juan Carlos, Sun, Min

Humans have the amazing ability to perform very subtle manipulation tasks using a closed-loop control system with imprecise mechanics (i.e., our body parts) but rich sensory information (e.g., vision, tactile, etc.). In the closed-loop system, the abi

External link: http://arxiv.org/abs/1808.01725
