Showing 1 - 10 of 2,365 results for search: '"Torresani, A."'
In the field of vision-language contrastive learning, models such as CLIP capitalize on matched image-caption pairs as positive examples and leverage within-batch non-matching pairs as negatives. This approach has led to remarkable outcomes in zero-s
External link:
http://arxiv.org/abs/2407.01408
Author:
Nagarajan, Tushar, Torresani, Lorenzo
Comparing a user video to a reference how-to video is a key requirement for AR/VR technology delivering personalized assistance tailored to the user's progress. However, current approaches for language-based assistance can only answer questions about
External link:
http://arxiv.org/abs/2404.16222
Author:
Islam, Md Mohaiminul, Ho, Ngan, Yang, Xitong, Nagarajan, Tushar, Torresani, Lorenzo, Bertasius, Gedas
Most video captioning models are designed to process short video clips of a few seconds and output text describing low-level visual concepts (e.g., objects, scenes, atomic actions). However, most real-world videos last for minutes or hours and have a c
External link:
http://arxiv.org/abs/2402.13250
Author:
Grauman, Kristen, Westbury, Andrew, Torresani, Lorenzo, Kitani, Kris, Malik, Jitendra, Afouras, Triantafyllos, Ashutosh, Kumar, Baiyya, Vijay, Bansal, Siddhant, Boote, Bikram, Byrne, Eugene, Chavis, Zach, Chen, Joya, Cheng, Feng, Chu, Fu-Jen, Crane, Sean, Dasgupta, Avijit, Dong, Jing, Escobar, Maria, Forigua, Cristhian, Gebreselasie, Abrham, Haresh, Sanjay, Huang, Jing, Islam, Md Mohaiminul, Jain, Suyog, Khirodkar, Rawal, Kukreja, Devansh, Liang, Kevin J, Liu, Jia-Wei, Majumder, Sagnik, Mao, Yongsen, Martin, Miguel, Mavroudi, Effrosyni, Nagarajan, Tushar, Ragusa, Francesco, Ramakrishnan, Santhosh Kumar, Seminara, Luigi, Somayazulu, Arjun, Song, Yale, Su, Shan, Xue, Zihui, Zhang, Edward, Zhang, Jinxu, Castillo, Angela, Chen, Changan, Fu, Xinzhu, Furuta, Ryosuke, Gonzalez, Cristina, Gupta, Prince, Hu, Jiabo, Huang, Yifei, Huang, Yiming, Khoo, Weslie, Kumar, Anush, Kuo, Robert, Lakhavani, Sach, Liu, Miao, Luo, Mi, Luo, Zhengyi, Meredith, Brighid, Miller, Austin, Oguntola, Oluwatumininu, Pan, Xiaqing, Peng, Penny, Pramanick, Shraman, Ramazanova, Merey, Ryan, Fiona, Shan, Wei, Somasundaram, Kiran, Song, Chenan, Southerland, Audrey, Tateno, Masatoshi, Wang, Huiyu, Wang, Yuchen, Yagi, Takuma, Yan, Mingfei, Yang, Xitong, Yu, Zecheng, Zha, Shengxin Cindy, Zhao, Chen, Zhao, Ziwei, Zhu, Zhifan, Zhuo, Jeff, Arbelaez, Pablo, Bertasius, Gedas, Crandall, David, Damen, Dima, Engel, Jakob, Farinella, Giovanni Maria, Furnari, Antonino, Ghanem, Bernard, Hoffman, Judy, Jawahar, C. V., Newcombe, Richard, Park, Hyun Soo, Rehg, James M., Sato, Yoichi, Savva, Manolis, Shi, Jianbo, Shou, Mike Zheng, Wray, Michael
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike re
External link:
http://arxiv.org/abs/2311.18259
Author:
Marco Giussani, Antonina Orlando, Elena Tassistro, Erminio Torresani, Giulia Lieti, Ilenia Patti, Claudia Colombrita, Ilaria Bulgarelli, Laura Antolini, Gianfranco Parati, Simonetta Genovesi
Published in:
Italian Journal of Pediatrics, Vol 50, Iss 1, Pp 1-9 (2024)
Abstract. Background: Elevated lipoprotein(a) (Lp(a)) levels are associated with increased risk of atherosclerotic processes and cardiovascular events in adults. The amount of Lp(a) is mainly genetically determined. Therefore, it is important to identify
External link:
https://doaj.org/article/b5063d61237743a68975d11618c758cf
Author:
Tan, Reuben, De Lange, Matthias, Iuzzolino, Michael, Plummer, Bryan A., Saenko, Kate, Ridgeway, Karl, Torresani, Lorenzo
Long-term activity forecasting is an especially challenging research problem because it requires understanding the temporal relationships between observed actions, as well as the variability and complexity of human activities. Despite relying on stro
External link:
http://arxiv.org/abs/2307.12854
Published in:
Sampling Theory and Applications (SampTA) 2023, Smita Krishnaswamy, Bastian Rieck, Ian Adelstein and Guy Wolf, Jul 2023, New Haven (Yale University), United States
This paper is concerned with variational and Bayesian approaches to neuro-electromagnetic inverse problems (EEG and MEG). The strong indeterminacy of these problems is tackled by introducing sparsity inducing regularization/priors in a transformed do
External link:
http://arxiv.org/abs/2306.15262
Author:
Warion, Pierre, Torrésani, Bruno
This paper introduces a couple of new time-frequency transforms, designed to adapt their scale to specific features of the analyzed function. Such an adaptation is implemented via so-called focus functions, which control the window scale as a functio
External link:
http://arxiv.org/abs/2306.14550
In this paper we present an approach for localizing steps of procedural activities in narrated how-to videos. To deal with the scarcity of labeled data at scale, we source the step descriptions from a language knowledge base (wikiHow) containing inst
External link:
http://arxiv.org/abs/2306.03802
Many top-down architectures for instance segmentation achieve significant success when trained and tested on a pre-defined closed-world taxonomy. However, when deployed in the open world, they exhibit notable bias towards seen classes and suffer from s
External link:
http://arxiv.org/abs/2303.05503