Showing 1 - 10 of 6,173 for search: '"Ramnath, A."'
Time series forecasting is a crucial yet challenging task in machine learning, requiring domain-specific knowledge due to its wide-ranging applications. While recent Transformer models have improved forecasting capabilities, they come with high compu
External link:
http://arxiv.org/abs/2412.00994
Document Visual Question Answering (VQA) requires models to interpret textual information within complex visual layouts and comprehend spatial relationships to answer questions based on document images. Existing approaches often lack interpretability
External link:
http://arxiv.org/abs/2412.00151
We consider the problem of constructing embeddings of large attributed graphs and supporting multiple downstream learning tasks. We develop a graph embedding method, which is based on extending deep metric and unbiased contrastive learning techniques
External link:
http://arxiv.org/abs/2411.13014
Graph-level representations (and clustering/classification based on these representations) are required in a variety of applications. Examples include identifying malicious network traffic, prediction of protein properties, and many others. Often, da
External link:
http://arxiv.org/abs/2411.12098
Author:
Navard, Pouyan, Monsefi, Amin Karimi, Zhou, Mengxi, Chao, Wei-Lun, Yilmaz, Alper, Ramnath, Rajiv
Recent advances in diffusion models have significantly improved text-to-image (T2I) generation, but they often struggle to balance fine-grained precision with high-level control. Methods like ControlNet and T2I-Adapter excel at following sketches by
External link:
http://arxiv.org/abs/2410.01595
Author:
Monsefi, Amin Karimi, Zhou, Mengxi, Monsefi, Nastaran Karimi, Lim, Ser-Nam, Chao, Wei-Lun, Ramnath, Rajiv
We present a novel frequency-based Self-Supervised Learning (SSL) approach that significantly enhances its efficacy for pre-training. Prior work in this direction masks out pre-defined frequencies in the input image and employs a reconstruction loss
External link:
http://arxiv.org/abs/2409.10362
In this paper, we introduce DetailCLIP: A Detail-Oriented CLIP to address the limitations of contrastive learning-based vision-language models, particularly CLIP, in handling detail-oriented and fine-grained tasks like segmentation. While CLIP and it
External link:
http://arxiv.org/abs/2409.06809
Whisper-to-normal speech conversion is an active area of research. Various architectures based on generative adversarial networks have been proposed in the recent past. In particular, a recent study shows that MaskCycleGAN, which is a mask-guided, and cyc
External link:
http://arxiv.org/abs/2408.14797
Author:
Wang, Zhichao, Bi, Bin, Pentyala, Shiva Kumar, Ramnath, Kiran, Chaudhuri, Sougata, Mehrotra, Shubham, Zixu, Zhu, Mao, Xiang-Bo, Asur, Sitaram, Na, Cheng
With advancements in self-supervised learning, the availability of trillions of tokens in a pre-training corpus, instruction fine-tuning, and the development of large Transformers with billions of parameters, large language models (LLMs) are now capable
External link:
http://arxiv.org/abs/2407.16216
Author:
Ramnath, Andrecia, Zapp, Korinna
We perform the first systematic study of the effects of multi-parton interactions (MPIs) in the context of jet quenching in heavy-ion collisions with the jet quenching model JEWEL. We use the simple MPI model of PYTHIA 6, on which JEWEL is based. We
External link:
http://arxiv.org/abs/2407.03066