Showing 1 - 10 of 10 for search: '"Sampat, Shailaja Keyur"'
Author:
Sampat, Shailaja Keyur, Nakamura, Mutsumi, Kailas, Shankar, Aggarwal, Kartik, Zhou, Mandy, Yang, Yezhou, Baral, Chitta
Deriving inferences from heterogeneous inputs (such as images, text, and audio) is an important skill for humans performing day-to-day tasks. A similar ability is desirable for the development of advanced Artificial Intelligence (AI) systems. While st…
External link:
http://arxiv.org/abs/2410.13666
Humans observe various actions being performed by other humans (physically or in videos/images) and can draw a wide range of inferences about them beyond what they can visually perceive. Such inferences include determining the aspects of the world that…
External link:
http://arxiv.org/abs/2410.13662
An ability to learn about new objects from a small amount of visual data and produce convincing linguistic justification about the presence/absence of certain concepts (that collectively compose the object) in novel scenarios is an important characte…
External link:
http://arxiv.org/abs/2410.13651
'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). This has been an important research dir…
External link:
http://arxiv.org/abs/2212.03866
'Actions' play a vital role in how humans interact with the world. Thus, autonomous agents that would assist us in everyday tasks also require the capability to perform 'Reasoning about Actions & Change' (RAC). Recently, there has been growing intere…
External link:
http://arxiv.org/abs/2212.03433
'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals. As a result, most common sense (CS) knowledge for humans revolves around actions. While 'Reasoning about Actions & Change' (RAC) has been wide…
External link:
http://arxiv.org/abs/2207.07568
Author:
Wang, Yizhong, Mishra, Swaroop, Alipoormolabashi, Pegah, Kordi, Yeganeh, Mirzaei, Amirreza, Arunkumar, Anjana, Ashok, Arjun, Dhanasekaran, Arut Selvan, Naik, Atharva, Stap, David, Pathak, Eshaan, Karamanolakis, Giannis, Lai, Haizhi Gary, Purohit, Ishan, Mondal, Ishani, Anderson, Jacob, Kuznia, Kirby, Doshi, Krima, Patel, Maitreya, Pal, Kuntal Kumar, Moradshahi, Mehrad, Parmar, Mihir, Purohit, Mirali, Varshney, Neeraj, Kaza, Phani Rohitha, Verma, Pulkit, Puri, Ravsehaj Singh, Karia, Rushang, Sampat, Shailaja Keyur, Doshi, Savan, Mishra, Siddhartha, Reddy, Sujan, Patro, Sumanta, Dixit, Tanay, Shen, Xudong, Baral, Chitta, Choi, Yejin, Smith, Noah A., Hajishirzi, Hannaneh, Khashabi, Daniel
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions…
External link:
http://arxiv.org/abs/2204.07705
Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video. In this paper, we take visual understanding to a higher level where systems are challenged to answer questions that involve…
External link:
http://arxiv.org/abs/2104.05981
Author:
Luo, Man, Sampat, Shailaja Keyur, Tallman, Riley, Zeng, Yankai, Vancha, Manuha, Sajja, Akarshan, Baral, Chitta
GQA~\citep{hudson2019gqa} is a dataset for real-world visual reasoning and compositional question answering. We found that many answers predicted by the best vision-language models on the GQA dataset do not match the ground-truth answer but still are…
External link:
http://arxiv.org/abs/2103.15022
Understanding images and text together is an important aspect of cognition and of building advanced Artificial Intelligence (AI) systems. As a community, we have achieved good benchmarks over the language and vision domains separately; however, joint reasoni…
External link:
http://arxiv.org/abs/2005.00330