Showing 1 - 10 of 948 for search: '"JAMAL, MUHAMMAD"'
We introduce VidLPRO, a novel video-language (VL) pre-training framework designed specifically for robotic and laparoscopic surgery. While existing surgical VL models primarily rely on contrastive learning, we propose a more comprehensive approach to
External link:
http://arxiv.org/abs/2409.04732
In this paper, we propose a new progressive pre-training method for image understanding tasks which leverages RGB-D datasets. The method utilizes Multi-Modal Contrastive Masked Autoencoder and Denoising techniques. Our proposed approach consists of t
External link:
http://arxiv.org/abs/2408.02245
Surgical scene understanding is a key technical component for enabling intelligent and context aware systems that can transform various aspects of surgical interventions. In this work, we focus on the semantic segmentation task, propose a simple yet
External link:
http://arxiv.org/abs/2407.19714
Author:
Hamoud, Idris, Jamal, Muhammad Abdullah, Srivastav, Vinkle, Mutter, Didier, Padoy, Nicolas, Mohareri, Omid
Surgical robotics holds much promise for improving patient safety and clinician experience in the Operating Room (OR). However, it also comes with new challenges, requiring strong team coordination and effective OR management. Automatic detection of
External link:
http://arxiv.org/abs/2312.12250
We present a new pre-training strategy called M$^{3}$3D ($\underline{M}$ulti-$\underline{M}$odal $\underline{M}$asked $\underline{3D}$) built based on Multi-modal masked autoencoders that can leverage 3D priors and learned cross-modal representations
External link:
http://arxiv.org/abs/2309.15313
There has been a growing interest in using deep learning models for processing long surgical videos, in order to automatically detect clinical/operational activities and extract metrics that can enable workflow efficiency tools and applications. Howe
External link:
http://arxiv.org/abs/2305.11451
Data-driven approaches to assist operating room (OR) workflow analysis depend on large curated datasets that are time consuming and expensive to collect. On the other hand, we see a recent paradigm shift from supervised learning to self-supervised an
External link:
http://arxiv.org/abs/2207.07894
Activity recognition in surgical videos is a key research area for developing next-generation devices and workflow monitoring systems. Since surgeries are long processes with highly-variable lengths, deep learning models used for surgical videos ofte
External link:
http://arxiv.org/abs/2205.02805
Published in:
In Heliyon 15 October 2024 10(19)
Author:
Naeem, Muhammad Khizar Hayat, Wang, Yanqing, Ayub, Mariam, Akram, Awais, Jamal, Muhammad Sarwat
Published in:
In Acta Psychologica September 2024 249