Search results - "LIU, Shuming"

Dissertation/Thesis

Understanding Chromatin Organization and Dynamics with Coarse-Grained Modeling

Author: Liu, Shuming

The genome is the blueprint of human life, and it is crucial to understand its organization. The genome organization is hierarchical with different principles dominating at different scales. At the near-atomistic level, nucleosomes are organized as o

External link: https://hdl.handle.net/1721.1/157061

Report

Harnessing Temporal Causality for Advanced Temporal Action Detection

Authors: Liu, Shuming; Sui, Lin; Zhang, Chen-Lin; Mu, Fangzhou; Zhao, Chen; Ghanem, Bernard

As a fundamental task in long-form video understanding, temporal action detection (TAD) aims to capture inherent temporal relations in untrimmed videos and identify candidate actions with precise boundaries. Over the years, various networks, includin

External link: http://arxiv.org/abs/2407.17792

Report

ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders

Authors: Hinojosa, Carlos; Liu, Shuming; Ghanem, Bernard

Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework, offering remarkable performance across a wide range of downstream tasks. To increase the difficulty of the pretext task and learn richer visual representations, existing wo

External link: http://arxiv.org/abs/2407.13036

Report

Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Authors: Zhao, Chen; Liu, Shuming; Mangalam, Karttikeya; Qian, Guocheng; Zohra, Fatimah; Alghannam, Abdulmohsen; Malik, Jitendra; Ghanem, Bernard

Published in: the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Large pretrained models are increasingly crucial in modern computer vision tasks. These models are typically used in downstream tasks by end-to-end finetuning, which is highly memory-intensive for tasks with high-resolution data, e.g., video understa

External link: http://arxiv.org/abs/2401.04105

Report

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

Authors: Liu, Shuming; Zhang, Chen-Lin; Zhao, Chen; Ghanem, Bernard

Recently, temporal action detection (TAD) has seen significant performance improvement with end-to-end training. However, due to the memory bottleneck, only models with limited scales and limited data volumes can afford end-to-end training, which ine

External link: http://arxiv.org/abs/2311.17241

Report

Mindstorms in Natural Language-Based Societies of Mind

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of

External link: http://arxiv.org/abs/2305.17066

Report

Boundary-Denoising for Video Activity Localization

Authors: Xu, Mengmeng; Soldan, Mattia; Gao, Jialin; Liu, Shuming; Pérez-Rúa, Juan-Manuel; Ghanem, Bernard

Video activity localization aims at understanding the semantic content in long untrimmed videos and retrieving actions of interest. The retrieved action with its start and end locations can be used for highlight generation, temporal action detection,

External link: http://arxiv.org/abs/2304.02934

Report

Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition

Authors: Hammoud, Hasan Abed Al Kader; Liu, Shuming; Alkhrashi, Mohammed; AlBalawi, Fahad; Ghanem, Bernard

Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean te

External link: http://arxiv.org/abs/2301.00986

Report

Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

Authors: Zhao, Chen; Liu, Shuming; Mangalam, Karttikeya; Ghanem, Bernard

Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content. Given limited GPU memory, training TAL end to end (i.e., from videos to predictions) on long videos is a significant challeng

External link: http://arxiv.org/abs/2211.14053

Report

ETAD: Training Action Detection End to End on a Laptop

Authors: Liu, Shuming; Xu, Mengmeng; Zhao, Chen; Zhao, Xu; Ghanem, Bernard

Temporal action detection (TAD) with end-to-end training often suffers from the pain of huge demand for computing resources due to long video duration. In this work, we propose an efficient temporal action detector (ETAD) that can train directly from

External link: http://arxiv.org/abs/2205.07134
