Výsledky vyhledávání

Report

SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs

Autor: Härle, Ruben, Friedrich, Felix, Brack, Manuel, Deiseroth, Björn, Schramowski, Patrick, Kersting, Kristian

Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text, but their output may not be aligned with the user or even produce harmful content. This paper presents a novel approach to detect and steer concepts

Externí odkaz: http://arxiv.org/abs/2411.07122

Zobrazit plný text záznamu

Report

u-$\mu$P: The Unit-Scaled Maximal Update Parametrization

Autor: Blake, Charlie, Eichenberg, Constantin, Dean, Josef, Balles, Lukas, Prince, Luke Y., Deiseroth, Björn, Cruz-Salinas, Andres Felipe, Luschi, Carlo, Weinbach, Samuel, Orr, Douglas

The Maximal Update Parametrization ($\mu$P) aims to make the optimal hyperparameters (HPs) of a model independent of its size, allowing them to be swept using a cheap proxy model rather than the full-size target model. We present a new scheme, u-$\mu

Externí odkaz: http://arxiv.org/abs/2407.17465

Zobrazit plný text záznamu

Akademický článek

Forum: Kündigungsrechtliche Aspekte bei Strafanzeigen gegenüber dem Arbeitgeber – Replik auf Deiseroth, Kündigungsschutz bei Kritik an Missständen in der Altenpflege, AuR 2007, 198

Autor: Binkert, Gerhard

Publikováno v: Arbeit und Recht, 2007 Jan 01. 55(6), 195-197.

Externí odkaz: https://www.jstor.org/stable/24025060

Zobrazit plný text záznamu

Report

T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

Autor: Deiseroth, Björn, Brack, Manuel, Schramowski, Patrick, Kersting, Kristian, Weinbach, Samuel

Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses. Major limitations include computational overhead, ineffective vocabulary use, and unnecessar

Externí odkaz: http://arxiv.org/abs/2406.19223

Zobrazit plný text záznamu

Akademický článek

Hess. VGH, 2. 12. 1976 — VII OE 72/75 m. Anm. v. RA Deiseroth. Erteilung einer auflagenfreien Aufenthaltserlaubnis an einen griechischen Staatsbürger

Autor: DEISEROTH, Dieter

Publikováno v: JuristenZeitung, 1978 Jan 01. 33(1), 21-27.

Externí odkaz: https://www.jstor.org/stable/20812034

Zobrazit plný text záznamu

Report

Mechanistic Design and Scaling of Hybrid Architectures

Autor: Poli, Michael, Thomas, Armin W, Nguyen, Eric, Ponnusamy, Pragaash, Deiseroth, Björn, Kersting, Kristian, Suzuki, Taiji, Hie, Brian, Ermon, Stefano, Ré, Christopher, Zhang, Ce, Massaroli, Stefano

The development of deep learning architectures is a resource-demanding process, due to a vast design space, long prototyping times, and high compute costs associated with at-scale model training and evaluation. We set out to simplify this process by

Externí odkaz: http://arxiv.org/abs/2403.17844

Zobrazit plný text záznamu

Report

Divergent Token Metrics: Measuring degradation to prune away LLM components -- and optimize quantization

Autor: Deiseroth, Björn, Meuer, Max, Gritsch, Nikolas, Eichenberg, Constantin, Schramowski, Patrick, Aßenmacher, Matthias, Kersting, Kristian

Large Language Models (LLMs) have reshaped natural language processing with their impressive capabilities. However, their ever-increasing size has raised concerns about their effective deployment and the need for LLM compression. This study introduce

Externí odkaz: http://arxiv.org/abs/2311.01544

Zobrazit plný text záznamu

Akademický článek

Hans‐Jörg Deiseroth on the Occasion of his 75^th Birthday.

Autor: Pfitzner, Arno¹ (AUTHOR) arno.pfitzner@chemie.uni‐regensburg.de

Publikováno v: Zeitschrift für Anorganische und Allgemeine Chemie. 2/14/2020, Vol. 646 Issue 3, p80-81. 2p.

Zobrazit plný text záznamu

Plný text ve formátu HTML

Report

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Autor: Bellagente, Marco, Brack, Manuel, Teufel, Hannah, Friedrich, Felix, Deiseroth, Björn, Eichenberg, Constantin, Dai, Andrew, Baldock, Robert, Nanda, Souradeep, Oostermeijer, Koen, Cruz-Salinas, Andres Felipe, Schramowski, Patrick, Kersting, Kristian, Weinbach, Samuel

The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful interpretations o

Externí odkaz: http://arxiv.org/abs/2305.15296

Zobrazit plný text záznamu

Report

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

Autor: Deiseroth, Björn, Deb, Mayukh, Weinbach, Samuel, Brack, Manuel, Schramowski, Patrick, Kersting, Kristian

Generative transformer models have become increasingly complex, with large numbers of parameters and the ability to process multiple input modalities. Current methods for explaining their predictions are resource-intensive. Most crucially, they requi

Externí odkaz: http://arxiv.org/abs/2301.08110

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání