Showing 1 - 10 of 108 results for search: '"Laudon, James"'
Domain-specific adaptation is critical to maximizing the performance of pre-trained language models (PLMs) on one or multiple targeted tasks, especially under resource-constrained use cases, such as edge devices. However, existing methods often struggle …
External link:
http://arxiv.org/abs/2410.10181
Author:
Zhou, Yanqi, Du, Nan, Huang, Yanping, Peng, Daiyi, Lan, Chang, Huang, Da, Shakeri, Siamak, So, David, Dai, Andrew, Lu, Yifeng, Chen, Zhifeng, Le, Quoc, Cui, Claire, Laudon, James, Dean, Jeff
Transformers are central to recent successes in natural language processing and computer vision. Transformers have a mostly uniform backbone where layers alternate between feed-forward and self-attention in order to build a deep network. Here we investigate …
External link:
http://arxiv.org/abs/2306.00008
Author:
Hu, Yi, Zhang, Chaoran, Andert, Edward, Singh, Harshul, Shrivastava, Aviral, Laudon, James, Zhou, Yanqi, Iannucci, Bob, Joe-Wong, Carlee
Careful placement of a computational application within a target device cluster is critical for achieving low application completion time. The problem is challenging due to its NP-hardness and combinatorial nature. In recent years, learning-based approaches …
External link:
http://arxiv.org/abs/2305.14562
Pretraining on a large-scale corpus has become a standard method to build general language models (LMs). Adapting a model to new data distributions targeting different downstream tasks poses significant challenges. Naive fine-tuning may incur catastrophic …
External link:
http://arxiv.org/abs/2305.12281
Author:
Zhou, Yanqi, Lei, Tao, Liu, Hanxiao, Du, Nan, Huang, Yanping, Zhao, Vincent, Dai, Andrew, Chen, Zhifeng, Le, Quoc, Laudon, James
Sparsely-activated Mixture-of-experts (MoE) models allow the number of parameters to greatly increase while keeping the amount of computation for a given token or a given sample unchanged. However, a poor expert routing strategy (e.g. one resulting in …
External link:
http://arxiv.org/abs/2202.09368
Author:
Xie, Xinfeng, Prabhu, Prakash, Beaugnon, Ulysse, Phothilimthana, Phitchaya Mangpo, Roy, Sudip, Mirhoseini, Azalia, Brevdo, Eugene, Laudon, James, Zhou, Yanqi
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of machine learning (ML) accelerators while delivering performance and energy efficiency on par with a monolithic large chip. However, ML compilers targeting MCMs need to solve complex …
External link:
http://arxiv.org/abs/2112.04041
Edge TPUs are a domain of accelerators for low-power, edge devices and are widely used in various Google products such as Coral and Pixel devices. In this paper, we first discuss the major microarchitectural details of Edge TPUs. Then, we extensively …
External link:
http://arxiv.org/abs/2102.10423
Author:
Zhou, Yanqi, Dong, Xuanyi, Akin, Berkin, Tan, Mingxing, Peng, Daiyi, Meng, Tianjian, Yazdanbakhsh, Amir, Huang, Da, Narayanaswami, Ravi, Laudon, James
Neural architectures and hardware accelerators have been two driving forces for the progress in deep learning. Previous works typically attempt to optimize hardware given a fixed model architecture or model architecture given fixed hardware. And the …
External link:
http://arxiv.org/abs/2102.08619
Author:
Yazdanbakhsh, Amir, Angermueller, Christof, Akin, Berkin, Zhou, Yanqi, Jones, Albin, Hashemi, Milad, Swersky, Kevin, Chatterjee, Satrajit, Narayanaswami, Ravi, Laudon, James
The looming end of Moore's Law and ascending use of deep learning drives the design of custom accelerators that are optimized for specific neural architectures. Architecture exploration for such accelerators forms a challenging constrained optimization …
External link:
http://arxiv.org/abs/2102.01723
Author:
Zhou, Yanqi, Roy, Sudip, Abdolrashidi, Amirali, Wong, Daniel, Ma, Peter, Xu, Qiumin, Liu, Hanxiao, Phothilimthana, Phitchaya Mangpo, Wang, Shen, Goldie, Anna, Mirhoseini, Azalia, Laudon, James
Published in:
NeurIPS 2020
Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code. Current ML compilers rely on heuristics based algorithms to solve these optimization problems one at a time. However, …
External link:
http://arxiv.org/abs/2010.12438