Search results

Report

TAPVid-3D: A Benchmark for Tracking Any Point in 3D

Authors: Koppula, Skanda; Rocco, Ignacio; Yang, Yi; Heyward, Joe; Carreira, João; Zisserman, Andrew; Brostow, Gabriel; Doersch, Carl

We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking Any Point in 3D (TAP-3D). While point tracking in two dimensions (TAP) has many benchmarks measuring performance on real-world videos, such as TAPVid-DAVIS, three

External link: http://arxiv.org/abs/2407.05921

Report

Memory Consolidation Enables Long-Context Video Understanding

Authors: Balažević, Ivana; Shi, Yuge; Papalampidi, Pinelopi; Chaabouni, Rahma; Koppula, Skanda; Hénaff, Olivier J.

Most transformer-based video encoders are limited to short temporal contexts due to their quadratic complexity. While various attempts have been made to extend this context, this has often come at the cost of both conceptual and computational complex

External link: http://arxiv.org/abs/2402.05861

Report

BootsTAP: Bootstrapped Training for Tracking-Any-Point

Authors: Doersch, Carl; Luc, Pauline; Yang, Yi; Gokay, Dilara; Koppula, Skanda; Gupta, Ankush; Heyward, Joseph; Rocco, Ignacio; Goroshin, Ross; Carreira, João; Zisserman, Andrew

To endow models with greater understanding of physics and motion, it is useful to enable them to perceive how solid surfaces move and deform in real scenes. This can be formalized as Tracking-Any-Point (TAP), which requires the algorithm to track any

External link: http://arxiv.org/abs/2402.00847

Report

Quantum Polynomial Hierarchies: Karp-Lipton, error reduction, and lower bounds

Authors: Agarwal, Avantika; Gharibian, Sevag; Koppula, Venkata; Rudolph, Dorian

Published in: 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

The Polynomial-Time Hierarchy ($\mathsf{PH}$) is a staple of classical complexity theory, with applications spanning randomized computation to circuit lower bounds to "quantum advantage" analyses for near-term quantum computers. Quantumly, however,

External link: http://arxiv.org/abs/2401.01633

Report

A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames

Authors: Papalampidi, Pinelopi; Koppula, Skanda; Pathak, Shreya; Chiu, Justin; Heyward, Joe; Patraucean, Viorica; Shen, Jiajun; Miech, Antoine; Zisserman, Andrew; Nematzadeh, Aida

Understanding long, real-world videos requires modeling of long-range visual dependencies. To this end, we explore video-first architectures, building on the common paradigm of transferring large-scale, image–text models to video via shallow tempora

External link: http://arxiv.org/abs/2312.07395

Academic article

Multimodality Imaging in the Diagnosis and Staging of Gestational Choriocarcinoma

Authors: Anitha Mandava, Veeraiah Koppula, Meghana Kandati, Arvind K. Reddy, Senthil J. Rajappa, T. S. Rao

Published in: Indian Journal of Radiology and Imaging, Vol 35, Iss 01, Pp 148-158 (2025)

Choriocarcinoma is an uncommon, highly invasive malignancy arising from the placental trophoblastic tissue. Though staging is clinical, imaging has a significant role in the diagnosis, staging, prognostic risk scoring, and management of choriocarcinoma

External link: https://doaj.org/article/42d9387898774cd1bdcdaf4ec60d11df

Academic article

Multimodality Imaging in the Diagnosis and Staging of Gestational Choriocarcinoma.

Authors: Mandava, Anitha¹ (kanisri@gmail.com); Koppula, Veeraiah¹; Kandati, Meghana¹; Reddy, Arvind K.¹; Rajappa, Senthil J.²; Rao, T. S.³

Published in: Indian Journal of Radiology & Imaging. Jan 2025, Vol. 35, Issue 1, pp. 148-158 (11 pp.)

Report

Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models

Authors: Su, Hsuan; Hu, Ting-Yao; Koppula, Hema Swetha; Vemulapalli, Raviteja; Chang, Jen-Hao Rick; Yang, Karren; Mantena, Gautam Varma; Tuzel, Oncel

While Automatic Speech Recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to new domains and need to be finetuned on data from these domains. However, target-domain data usually are not readil

External link: http://arxiv.org/abs/2309.10707
