Showing 1 - 10 of 869 for search: '"LIU, Shuming"'
Author:
Liu, Shuming
The genome is the blueprint of human life, and it is crucial to understand its organization. The genome organization is hierarchical with different principles dominating at different scales. At the near-atomistic level, nucleosomes are organized as o
External link:
https://hdl.handle.net/1721.1/157061
As a fundamental task in long-form video understanding, temporal action detection (TAD) aims to capture inherent temporal relations in untrimmed videos and identify candidate actions with precise boundaries. Over the years, various networks, includin
External link:
http://arxiv.org/abs/2407.17792
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework, offering remarkable performance across a wide range of downstream tasks. To increase the difficulty of the pretext task and learn richer visual representations, existing wo
External link:
http://arxiv.org/abs/2407.13036
Author:
Zhao, Chen, Liu, Shuming, Mangalam, Karttikeya, Qian, Guocheng, Zohra, Fatimah, Alghannam, Abdulmohsen, Malik, Jitendra, Ghanem, Bernard
Published in:
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024
Large pretrained models are increasingly crucial in modern computer vision tasks. These models are typically used in downstream tasks by end-to-end finetuning, which is highly memory-intensive for tasks with high-resolution data, e.g., video understa
External link:
http://arxiv.org/abs/2401.04105
Recently, temporal action detection (TAD) has seen significant performance improvement with end-to-end training. However, due to the memory bottleneck, only models with limited scales and limited data volumes can afford end-to-end training, which ine
External link:
http://arxiv.org/abs/2311.17241
Author:
Zhuge, Mingchen, Liu, Haozhe, Faccio, Francesco, Ashley, Dylan R., Csordás, Róbert, Gopalakrishnan, Anand, Hamdi, Abdullah, Hammoud, Hasan Abed Al Kader, Herrmann, Vincent, Irie, Kazuki, Kirsch, Louis, Li, Bing, Li, Guohao, Liu, Shuming, Mai, Jinjie, Piękos, Piotr, Ramesh, Aditya, Schlag, Imanol, Shi, Weimin, Stanić, Aleksandar, Wang, Wenyi, Wang, Yuhui, Xu, Mengmeng, Fan, Deng-Ping, Ghanem, Bernard, Schmidhuber, Jürgen
Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of
External link:
http://arxiv.org/abs/2305.17066
Author:
Xu, Mengmeng, Soldan, Mattia, Gao, Jialin, Liu, Shuming, Pérez-Rúa, Juan-Manuel, Ghanem, Bernard
Video activity localization aims at understanding the semantic content in long untrimmed videos and retrieving actions of interest. The retrieved action with its start and end locations can be used for highlight generation, temporal action detection,
External link:
http://arxiv.org/abs/2304.02934
Author:
Hammoud, Hasan Abed Al Kader, Liu, Shuming, Alkhrashi, Mohammed, AlBalawi, Fahad, Ghanem, Bernard
Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean te
External link:
http://arxiv.org/abs/2301.00986
Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content. Given limited GPU memory, training TAL end to end (i.e., from videos to predictions) on long videos is a significant challeng
External link:
http://arxiv.org/abs/2211.14053
Temporal action detection (TAD) with end-to-end training often suffers from the pain of huge demand for computing resources due to long video duration. In this work, we propose an efficient temporal action detector (ETAD) that can train directly from
External link:
http://arxiv.org/abs/2205.07134