Showing 1 - 10 of 108 results for search: '"Laudon, James"'
Domain-specific adaptation is critical to maximizing the performance of pre-trained language models (PLMs) on one or multiple targeted tasks, especially under resource-constrained use cases, such as edge devices. However, existing methods often struggle …
External link:
http://arxiv.org/abs/2410.10181
Author:
Zhou, Yanqi, Du, Nan, Huang, Yanping, Peng, Daiyi, Lan, Chang, Huang, Da, Shakeri, Siamak, So, David, Dai, Andrew, Lu, Yifeng, Chen, Zhifeng, Le, Quoc, Cui, Claire, Laudon, James, Dean, Jeff
Transformers are central to recent successes in natural language processing and computer vision. Transformers have a mostly uniform backbone where layers alternate between feed-forward and self-attention in order to build a deep network. Here we investigate …
External link:
http://arxiv.org/abs/2306.00008
Author:
Hu, Yi, Zhang, Chaoran, Andert, Edward, Singh, Harshul, Shrivastava, Aviral, Laudon, James, Zhou, Yanqi, Iannucci, Bob, Joe-Wong, Carlee
Careful placement of a computational application within a target device cluster is critical for achieving low application completion time. The problem is challenging due to its NP-hardness and combinatorial nature. In recent years, learning-based approaches …
External link:
http://arxiv.org/abs/2305.14562
Pretraining on a large-scale corpus has become a standard method to build general language models (LMs). Adapting a model to new data distributions targeting different downstream tasks poses significant challenges. Naive fine-tuning may incur catastrophic …
External link:
http://arxiv.org/abs/2305.12281
Author:
Zhou, Yanqi, Lei, Tao, Liu, Hanxiao, Du, Nan, Huang, Yanping, Zhao, Vincent, Dai, Andrew, Chen, Zhifeng, Le, Quoc, Laudon, James
Sparsely-activated Mixture-of-experts (MoE) models allow the number of parameters to greatly increase while keeping the amount of computation for a given token or a given sample unchanged. However, a poor expert routing strategy (e.g. one resulting in …
External link:
http://arxiv.org/abs/2202.09368
Author:
Xie, Xinfeng, Prabhu, Prakash, Beaugnon, Ulysse, Phothilimthana, Phitchaya Mangpo, Roy, Sudip, Mirhoseini, Azalia, Brevdo, Eugene, Laudon, James, Zhou, Yanqi
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of machine learning (ML) accelerators while delivering performance and energy efficiency on par with a monolithic large chip. However, ML compilers targeting MCMs need to solve complex …
External link:
http://arxiv.org/abs/2112.04041
Edge TPUs are a domain of accelerators for low-power, edge devices and are widely used in various Google products such as Coral and Pixel devices. In this paper, we first discuss the major microarchitectural details of Edge TPUs. Then, we extensively …
External link:
http://arxiv.org/abs/2102.10423
Author:
Zhou, Yanqi, Dong, Xuanyi, Akin, Berkin, Tan, Mingxing, Peng, Daiyi, Meng, Tianjian, Yazdanbakhsh, Amir, Huang, Da, Narayanaswami, Ravi, Laudon, James
Neural architectures and hardware accelerators have been two driving forces for the progress in deep learning. Previous works typically attempt to optimize hardware given a fixed model architecture or model architecture given fixed hardware. And the …
External link:
http://arxiv.org/abs/2102.08619
Author:
Yazdanbakhsh, Amir, Angermueller, Christof, Akin, Berkin, Zhou, Yanqi, Jones, Albin, Hashemi, Milad, Swersky, Kevin, Chatterjee, Satrajit, Narayanaswami, Ravi, Laudon, James
The looming end of Moore's Law and ascending use of deep learning drives the design of custom accelerators that are optimized for specific neural architectures. Architecture exploration for such accelerators forms a challenging constrained optimization …
External link:
http://arxiv.org/abs/2102.01723
Author:
Zhou, Yanqi, Roy, Sudip, Abdolrashidi, Amirali, Wong, Daniel, Ma, Peter, Xu, Qiumin, Liu, Hanxiao, Phothilimthana, Phitchaya Mangpo, Wang, Shen, Goldie, Anna, Mirhoseini, Azalia, Laudon, James
Published in:
NeurIPS 2020
Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code. Current ML compilers rely on heuristics based algorithms to solve these optimization problems one at a time. However, …
External link:
http://arxiv.org/abs/2010.12438