Showing 1 - 10 of 382 for search: '"Å. Grauman"'
Published in:
BMC Medical Informatics and Decision Making, Vol 23, Iss 1, Pp 1-13 (2023)
Abstract Background The implementation of precision medicine is likely to have a huge impact on clinical cancer care, while the doctor-patient relationship is a crucial aspect of cancer care that needs to be preserved. This systematic review aimed to
External link:
https://doaj.org/article/908625fbb5604742abae3acda2d8deb4
Given a multi-view video, which viewpoint is most informative for a human observer? Existing methods rely on heuristics or expensive "best-view" supervision to answer this question, limiting their applicability. We propose a weakly supervised approa
External link:
http://arxiv.org/abs/2411.08753
Author:
Lai, Bolin, Toyer, Sam, Nagarajan, Tushar, Girdhar, Rohit, Zha, Shengxin, Rehg, James M., Kitani, Kris, Grauman, Kristen, Desai, Ruta, Liu, Miao
Predicting future human behavior is an increasingly popular topic in computer vision, driven by the interest in applications such as autonomous vehicles, digital assistants and human-robot interactions. The literature on behavior prediction spans var
External link:
http://arxiv.org/abs/2410.14045
Feedback is essential for learning a new skill or improving one's current skill-level. However, current methods for skill-assessment from video only provide scores or compare demonstrations, leaving the burden of knowing what to do differently on the
External link:
http://arxiv.org/abs/2408.00672
Author:
Chen, Changan, Peng, Puyuan, Baid, Ami, Xue, Zihui, Hsu, Wei-Ning, Harwath, David, Grauman, Kristen
Generating realistic audio for human actions is important for many applications, such as creating sound effects for films or virtual reality games. Existing approaches implicitly assume total correspondence between the video and audio during training
External link:
http://arxiv.org/abs/2406.09272
We study the problem of precisely swapping objects in videos, with a focus on those interacted with by hands, given one user-provided reference object image. Despite the great advancements that diffusion models have made in video editing recently, th
External link:
http://arxiv.org/abs/2406.07754
Sim2real transfer has received increasing attention lately due to the success of learning robotic tasks in simulation end-to-end. While there has been a lot of progress in transferring vision-based navigation policies, the existing sim2real strategy
External link:
http://arxiv.org/abs/2405.02821
An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment, for any given source/receiver location. Traditional methods for constructing acoustic models involve expensive and time-consum
External link:
http://arxiv.org/abs/2404.16216
We propose a novel self-supervised embedding to learn how actions sound from narrated in-the-wild egocentric videos. Whereas existing methods rely on curated data with known audio-visual correspondence, our multimodal contrastive-consensus coding (MC
External link:
http://arxiv.org/abs/2404.05206
We investigate exocentric-to-egocentric cross-view translation, which aims to generate a first-person (egocentric) view of an actor based on a video recording that captures the actor from a third-person (exocentric) perspective. To this end, we propo
External link:
http://arxiv.org/abs/2403.06351