Showing 1 - 10 of 22,048 for search: '"Cherian BE"'
Physical reasoning is an important skill needed for robotic agents when operating in the real world. However, solving such reasoning problems often involves hypothesizing and reflecting over complex multi-body interactions under the effect of a multi
External link:
http://arxiv.org/abs/2411.08027
Accurate survival prediction is essential for personalized cancer treatment. However, genomic data - often a more powerful predictor than pathology data - is costly and inaccessible. We present the cross-modal genomic feature translation and alignmen
External link:
http://arxiv.org/abs/2411.00749
Author:
Patil, Abhijeet, Diwakar, Harsh, Sawant, Jay, Kurian, Nikhil Cherian, Yadav, Subhash, Rane, Swapnil, Bameta, Tripti, Sethi, Amit
Published in:
Journal of Pathology Informatics, 2023
Histopathology whole slide images (WSIs) are being widely used to develop deep learning-based diagnostic solutions, especially for precision oncology. Most of these diagnostic softwares are vulnerable to biases and impurities in the training and test
External link:
http://arxiv.org/abs/2409.19587
Author:
Binu, Sona, Jose, Jismi, K V, Fathima Shimna, Hans, Alino Luke, Cherian, Reni K., Alex, Starlet Ben, Srivastava, Priyanka, Yarra, Chiranjeevi
The people with Major Depressive Disorder (MDD) exhibit the symptoms of tonal variations in their speech compared to the healthy counterparts. However, these tonal variations not only confine to the state of MDD but also on the language, which has un
External link:
http://arxiv.org/abs/2409.14769
Author:
Zhang, Jiahao, Zhang, Frederic Z., Rodriguez, Cristian, Ben-Shabat, Yizhak, Cherian, Anoop, Gould, Stephen
We study the challenging problem of simultaneously localizing a sequence of queries in the form of instructional diagrams in a video. This requires understanding not only the individual queries but also their interrelationships. However, most existin
External link:
http://arxiv.org/abs/2407.12066
Author:
Yin, Jie, Luo, Andrew, Du, Yilun, Cherian, Anoop, Marks, Tim K., Roux, Jonathan Le, Gan, Chuang
We study the problem of multimodal physical scene understanding, where an embodied agent needs to find fallen objects by inferring object properties, direction, and distance of an impact sound source. Previous works adopt feed-forward neural networks
External link:
http://arxiv.org/abs/2407.11333
Radiology reports are highly technical documents aimed primarily at doctor-doctor communication. There has been an increasing interest in sharing those reports with patients, necessitating providing them patient-friendly simplifications of the origin
External link:
http://arxiv.org/abs/2406.18859
Author:
Cherian, Anoop, Peng, Kuan-Chuan, Lohit, Suhas, Matthiesen, Joanna, Smith, Kevin, Tenenbaum, Joshua B.
Recent years have seen a significant progress in the general-purpose problem solving abilities of large vision and language models (LVLMs), such as ChatGPT, Gemini, etc.; some of these breakthroughs even seem to enable AI models to outperform human a
External link:
http://arxiv.org/abs/2406.15736
We develop new conformal inference methods for obtaining validity guarantees on the output of large language models (LLMs). Prior work in conformal language modeling identifies a subset of the text that satisfies a high-probability guarantee of corre
External link:
http://arxiv.org/abs/2406.09714
Author:
Ni, Haomiao, Egger, Bernhard, Lohit, Suhas, Cherian, Anoop, Wang, Ye, Koike-Akino, Toshiaki, Huang, Sharon X., Marks, Tim K.
Text-conditioned image-to-video generation (TI2V) aims to synthesize a realistic video starting from a given image (e.g., a woman's photo) and a text description (e.g., "a woman is drinking water."). Existing TI2V frameworks often require costly trai
External link:
http://arxiv.org/abs/2404.16306