Showing 1 - 10 of 26 for search: '"Bae, Juhan"'
Author:
Mlodozeniec, Bruno, Eschenhagen, Runa, Bae, Juhan, Immer, Alexander, Krueger, David, Turner, Richard
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in diffusion models…
External link:
http://arxiv.org/abs/2410.13850
Author:
Choe, Sang Keun, Ahn, Hwijeen, Bae, Juhan, Zhao, Kewen, Kang, Minsoo, Chung, Youngseog, Pratapa, Adithya, Neiswanger, Willie, Strubell, Emma, Mitamura, Teruko, Schneider, Jeff, Hovy, Eduard, Grosse, Roger, Xing, Eric
Large language models (LLMs) are trained on a vast amount of human-written data, but data providers often remain uncredited. In response to this issue, data valuation (or data attribution), which quantifies the contribution or value of each data point…
External link:
http://arxiv.org/abs/2405.13954
Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be made computationally…
External link:
http://arxiv.org/abs/2405.12186
Adaptive gradient optimizers like Adam(W) are the default training algorithms for many deep learning architectures, such as transformers. Their diagonal preconditioner is based on the gradient outer product which is incorporated into the parameter update…
External link:
http://arxiv.org/abs/2402.03496
This paper studies using foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO). Empirical evaluations demonstrate that in settings with constrained search budgets, LLMs can perform comparably or better than…
External link:
http://arxiv.org/abs/2312.04528
Author:
Grosse, Roger, Bae, Juhan, Anil, Cem, Elhage, Nelson, Tamkin, Alex, Tajdini, Amirhossein, Steiner, Benoit, Li, Dustin, Durmus, Esin, Perez, Ethan, Hubinger, Evan, Lukošiūtė, Kamilė, Nguyen, Karina, Joseph, Nicholas, McCandlish, Sam, Kaplan, Jared, Bowman, Samuel R.
When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions…
External link:
http://arxiv.org/abs/2308.03296
Author:
Dahl, George E., Schneider, Frank, Nado, Zachary, Agarwal, Naman, Sastry, Chandramouli Shama, Hennig, Philipp, Medapati, Sourabh, Eschenhagen, Runa, Kasimbeg, Priya, Suo, Daniel, Bae, Juhan, Gilmer, Justin, Peirson, Abel L., Khan, Bilal, Anil, Rohan, Rabbat, Mike, Krishnan, Shankar, Snider, Daniel, Amid, Ehsan, Chen, Kongtao, Maddison, Chris J., Vasudev, Rakshith, Badura, Michal, Garg, Ankush, Mattson, Peter
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules…
External link:
http://arxiv.org/abs/2306.07179
It is often useful to compactly summarize important properties of model parameters and training data so that they can be used later without storing and/or iterating over the entire dataset. As a specific case, we consider estimating the Function Space…
External link:
http://arxiv.org/abs/2302.03519
Author:
Bae, Juhan, Zhang, Michael R., Ruan, Michael, Wang, Eric, Hasegawa, So, Ba, Jimmy, Grosse, Roger
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent variable…
External link:
http://arxiv.org/abs/2212.03905
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment…
External link:
http://arxiv.org/abs/2209.05364