Search results - "Ojha, Utkarsh"

Report

On the Effectiveness of Dataset Alignment for Fake Image Detection

Authors: Rajan, Anirudh Sundara; Ojha, Utkarsh; Schloesser, Jedidiah; Lee, Yong Jae

As latent diffusion models (LDMs) democratize image generation capabilities, there is a growing need to detect fake images. A good detector should focus on the generative model's fingerprints while ignoring image properties such as semantic content, r

External link: http://arxiv.org/abs/2410.11835

Report

Yo'LLaVA: Your Personalized Language and Vision Assistant

Authors: Nguyen, Thao; Liu, Haotian; Li, Yuheng; Cai, Mu; Ojha, Utkarsh; Lee, Yong Jae

Large Multimodal Models (LMMs) have shown remarkable capabilities across a variety of tasks (e.g., image captioning, visual question answering). While broad, their knowledge remains generic (e.g., recognizing a dog), and they are unable to handle per

External link: http://arxiv.org/abs/2406.09400

Report

Visual Instruction Inversion: Image Editing via Visual Prompting

Authors: Nguyen, Thao; Li, Yuheng; Ojha, Utkarsh; Lee, Yong Jae

Text-conditioned image editing has emerged as a powerful tool for editing images. However, in many situations, language can be ambiguous and ineffective in describing specific image edits. When faced with such challenges, visual prompts can be a more

External link: http://arxiv.org/abs/2307.14331

Report

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding

Authors: Cai, Mu; Huang, Zeyi; Li, Yuheng; Ojha, Utkarsh; Wang, Haohan; Lee, Yong Jae

Large language models (LLMs) have made significant advancements in natural language understanding. However, through that enormous semantic representation that the LLM has learnt, is it somehow possible for it to understand images as well? This work i

External link: http://arxiv.org/abs/2306.06094

Report

Towards Universal Fake Image Detectors that Generalize Across Generative Models

Authors: Ojha, Utkarsh; Li, Yuheng; Lee, Yong Jae

With generative models proliferating at a rapid rate, there is a growing need for general purpose fake image detectors. In this work, we first show that the existing paradigm, which consists of training a deep network for real-vs-fake classification,

External link: http://arxiv.org/abs/2302.10174

Report

What Knowledge Gets Distilled in Knowledge Distillation?

Authors: Ojha, Utkarsh; Li, Yuheng; Rajan, Anirudh Sundara; Liang, Yingyu; Lee, Yong Jae

Knowledge distillation aims to transfer useful information from a teacher network to a student network, with the primary goal of improving the student's performance for the task at hand. Over the years, there has been a deluge of novel techniques a

External link: http://arxiv.org/abs/2205.16004

Report

Few-shot Image Generation via Cross-domain Correspondence

Authors: Ojha, Utkarsh; Li, Yijun; Lu, Jingwan; Efros, Alexei A.; Lee, Yong Jae; Shechtman, Eli; Zhang, Richard

Training generative models, such as GANs, on a target domain containing limited examples (e.g., 10) can easily result in overfitting. In this work, we seek to utilize a large source domain for pretraining and transfer the diversity information from s

External link: http://arxiv.org/abs/2104.06820

Report

Generating Furry Cars: Disentangling Object Shape & Appearance across Multiple Domains

Authors: Ojha, Utkarsh; Singh, Krishna Kumar; Lee, Yong Jae

We consider the novel task of learning disentangled representations of object shape and appearance across multiple domains (e.g., dogs and cars). The goal is to learn a generative model that learns an intermediate distribution, which borrows a subset

External link: http://arxiv.org/abs/2104.02052

Report

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation

Authors: Li, Yuheng; Singh, Krishna Kumar; Ojha, Utkarsh; Lee, Yong Jae

Published in: CVPR 2020

We present MixNMatch, a conditional generative model that learns to disentangle and encode background, object pose, shape, and texture from real images with minimal supervision, for mix-and-match image generation. We build upon FineGAN, an unconditio

External link: http://arxiv.org/abs/1911.11758

Report

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Class-Imbalanced Data

Authors: Ojha, Utkarsh; Singh, Krishna Kumar; Hsieh, Cho-Jui; Lee, Yong Jae

We propose a novel unsupervised generative model that learns to disentangle object identity from other low-level aspects in class-imbalanced data. We first investigate the issues surrounding the assumptions about uniformity made by InfoGAN, and demon

External link: http://arxiv.org/abs/1910.01112
