Search results - "Baumann, Stefan"

Report

CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models

Authors: Stracke, Nick; Baumann, Stefan Andreas; Susskind, Joshua M.; Bautista, Miguel Angel; Ommer, Björn

Text-to-image generative models have become a prominent and powerful tool that excels at generating high-resolution realistic images. However, guiding the generative process of these models to consider detailed forms of conditioning reflecting style

External link: http://arxiv.org/abs/2405.07913

Report

Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions

Authors: Baumann, Stefan Andreas; Krause, Felix; Neumayr, Michael; Stracke, Nick; Hu, Vincent Tao; Ommer, Björn

In recent years, advances in text-to-image (T2I) diffusion models have substantially elevated the quality of their generated images. However, achieving fine-grained control over attributes remains a challenge due to the limitations of natural languag

External link: http://arxiv.org/abs/2403.17064

Report

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

Authors: Hu, Vincent Tao; Baumann, Stefan Andreas; Gui, Ming; Grebenkova, Olga; Ma, Pingchuan; Fischer, Johannes; Ommer, Björn

The diffusion model has long been plagued by scalability and quadratic complexity issues, especially within transformer-based structures. In this study, we aim to leverage the long sequence modeling capability of a State-Space Model called Mamba to e

External link: http://arxiv.org/abs/2403.13802

Report

DepthFM: Fast Monocular Depth Estimation with Flow Matching

Authors: Gui, Ming; Fischer, Johannes S.; Prestel, Ulrich; Ma, Pingchuan; Kotovenko, Dmytro; Grebenkova, Olga; Baumann, Stefan Andreas; Hu, Vincent Tao; Ommer, Björn

Monocular depth estimation is crucial for numerous downstream vision tasks and applications. Current discriminative approaches to this problem are limited due to blurry artifacts, while state-of-the-art generative methods suffer from slow sampling du

External link: http://arxiv.org/abs/2403.13788

Report

Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Authors: Crowson, Katherine; Baumann, Stefan Andreas; Birch, Alex; Abraham, Tanishq Mathew; Kaplan, Daniel Z.; Shippole, Enrico

We present the Hourglass Diffusion Transformer (HDiT), an image generative model that exhibits linear scaling with pixel count, supporting training at high-resolution (e.g. $1024 \times 1024$) directly in pixel-space. Building on the Transformer arch

External link: http://arxiv.org/abs/2401.11605

Report

Boosting Latent Diffusion with Flow Matching

Authors: Fischer, Johannes S.; Gui, Ming; Ma, Pingchuan; Stracke, Nick; Baumann, Stefan A.; Ommer, Björn

Recently, there has been tremendous progress in visual synthesis and the underlying generative models. Here, diffusion models (DMs) stand out particularly, but lately, flow matching (FM) has also garnered considerable interest. While DMs excel in pro

External link: http://arxiv.org/abs/2312.07360
