Showing 1 - 10 of 70 for search: '"Jain, Jitesh"'
Author:
Li, Jiachen, Wang, Xinyao, Zhu, Sijie, Kuo, Chia-Wen, Xu, Lu, Chen, Fan, Jain, Jitesh, Shi, Humphrey, Wen, Longyin
Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks. However, these scaling approaches are computationally e
External link:
http://arxiv.org/abs/2405.05949
The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of hig
External link:
http://arxiv.org/abs/2403.18819
Humans possess the remarkable skill of Visual Perception, the ability to see and understand the seen, helping them make sense of the visual world and, in turn, reason. Multimodal Large Language Models (MLLM) have recently achieved impressive performa
External link:
http://arxiv.org/abs/2312.14233
In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance. MAM offers several si
External link:
http://arxiv.org/abs/2306.05399
Universal Image Segmentation is not a new concept. Past attempts to unify image segmentation in the last decades include scene parsing, panoptic segmentation, and, more recently, new panoptic architectures. However, such panoptic architectures do not
External link:
http://arxiv.org/abs/2211.06220
Deep image inpainting has made impressive progress with recent advances in image generation and processing algorithms. We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures. Structures re
External link:
http://arxiv.org/abs/2208.03382
Author:
Jain, Jitesh, Singh, Anukriti, Orlov, Nikita, Huang, Zilong, Li, Jiachen, Walton, Steven, Shi, Humphrey
Finetuning a pretrained backbone in the encoder part of an image transformer network has been the traditional approach for the semantic segmentation task. However, such an approach leaves out the semantic context that an image provides during the enc
External link:
http://arxiv.org/abs/2112.12782
Recent approaches for learning policies to improve caching target just one of the prefetching, admission, and eviction processes. In contrast, we propose an end-to-end pipeline to learn all three policies using machine learning. We also take insp
External link:
http://arxiv.org/abs/2009.09206
Published in:
In The Knee June 2014 21(3):726-730
Published in:
In Heart Rhythm May 2023 20(5) Supplement:S344-S345