Search results

Report

Thinking LLMs: General Instruction Following with Thought Generation

Authors: Wu, Tianhao; Lan, Janice; Yuan, Weizhe; Jiao, Jiantao; Weston, Jason; Sukhbaatar, Sainbayar

LLMs are typically trained to answer user questions or follow instructions similarly to how human experts respond. However, in the standard alignment framework they lack the basic ability of explicit thinking before answering. Thinking is important f

External link: http://arxiv.org/abs/2410.10630

Report

Improved Uncertainty Estimation of Graph Neural Network Potentials Using Engineered Latent Space Distances

Authors: Musielewicz, Joseph; Lan, Janice; Uyttendaele, Matt; Kitchin, John R.

Graph neural networks (GNNs) have been shown to be astonishingly capable models for molecular property prediction, particularly as surrogates for expensive density functional theory calculations of relaxed energy for novel material discovery. However

External link: http://arxiv.org/abs/2407.10844

Report

AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials

Authors: Lan, Janice; Palizhati, Aini; Shuaibi, Muhammed; Wood, Brandon M.; Wander, Brook; Das, Abhishek; Uyttendaele, Matt; Zitnick, C. Lawrence; Ulissi, Zachary W.

Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the adsorption energy for an adsorbate

External link: http://arxiv.org/abs/2211.16486

Report

Spherical Channels for Modeling Atomic Interactions

Authors: Zitnick, C. Lawrence; Das, Abhishek; Kolluru, Adeesh; Lan, Janice; Shuaibi, Muhammed; Sriram, Anuroop; Ulissi, Zachary; Wood, Brandon

Modeling the energy and forces of atomic systems is a fundamental problem in computational chemistry with the potential to help address many of the world's most pressing problems, including those related to energy scarcity and climate change. These c

External link: http://arxiv.org/abs/2206.14331

Report

The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts

Authors: Tran, Richard; Lan, Janice; Shuaibi, Muhammed; Wood, Brandon M.; Goyal, Siddharth; Das, Abhishek; Heras-Domingo, Javier; Kolluru, Adeesh; Rizvi, Ammar; Shoghi, Nima; Sriram, Anuroop; Therrien, Felix; Abed, Jehad; Voznyy, Oleksandr; Sargent, Edward H.; Ulissi, Zachary; Zitnick, C. Lawrence

The development of machine learning models for electrocatalysts requires a broad set of training data to enable their use across a wide variety of materials. One class of materials that currently lacks sufficient training data is oxides, which are cr

External link: http://arxiv.org/abs/2206.08917

Report

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Authors: Dathathri, Sumanth; Madotto, Andrea; Lan, Janice; Hung, Jane; Frank, Eric; Molino, Piero; Yosinski, Jason; Liu, Rosanne

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying t

External link: http://arxiv.org/abs/1912.02164

Report

First-Order Preconditioning via Hypergradient Descent

Authors: Moskovitz, Ted; Wang, Rui; Lan, Janice; Kapoor, Sanyam; Miconi, Thomas; Yosinski, Jason; Rawal, Aditya

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a pre-condit

External link: http://arxiv.org/abs/1910.08461

Report

LCA: Loss Change Allocation for Neural Network Training

Authors: Lan, Janice; Liu, Rosanne; Zhou, Hattie; Yosinski, Jason

Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into

External link: http://arxiv.org/abs/1909.01440

Report

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

Authors: Zhou, Hattie; Lan, Janice; Liu, Rosanne; Yosinski, Jason

The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial wei

External link: http://arxiv.org/abs/1905.01067
