Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices
Author: Pritish Narayanan, Katie Spoon, Charles Mackin, Geoffrey W. Burr, Andrea Fasoli, Stefano Ambrogio, An Chen, Hsinyu Tsai, Malte J. Rasch, Alexander Friz, Milos Stanisavljevic
Language: English
Year of publication: 2021
Subject: Computer science; Neuroscience; Cellular and Molecular Neuroscience; Deep learning; DNN; Transformer (machine learning model); BERT; Encoder; Artificial intelligence; In-memory computing; Analog accelerators; Phase-change memory (PCM); RRAM; Noise; Benchmark (computing); Computer engineering; Software
Source: Frontiers in Computational Neuroscience, Vol. 15 (2021)
ISSN: 1662-5188
Description: Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent inference accuracy on natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers) by combining noise-aware training, which combats inherent PCM drift and noise sources, with reduced-precision digital attention-block computation down to INT6. An illustrative sketch of these two techniques follows this record.
Database: OpenAIRE
External link:
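
The description pairs two techniques: noise-aware training (injecting device-like noise into the weights during training so the network becomes robust to PCM drift and variability) and reduced-precision INT6 arithmetic in the digital attention blocks. Below is a minimal PyTorch sketch of both ideas, assuming multiplicative Gaussian weight noise and symmetric per-tensor fake quantization; `NoisyLinear`, `noise_std`, and `fake_quant_int6` are hypothetical names for illustration, not the paper's actual implementation.

```python
# Illustrative sketch only; assumes a PyTorch workflow. The noise model and
# quantizer are common proxies, not the method described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quant_int6(t: torch.Tensor) -> torch.Tensor:
    """Symmetric per-tensor fake quantization emulating INT6 arithmetic:
    scale, round to 6-bit signed integers, then rescale to floating point."""
    qmax = 2 ** (6 - 1) - 1                       # 31 for signed INT6
    scale = t.abs().max().clamp(min=1e-8) / qmax
    return torch.round(t / scale).clamp(-qmax - 1, qmax) * scale


class NoisyLinear(nn.Linear):
    """Linear layer injecting multiplicative Gaussian weight noise during
    training, a common proxy for PCM conductance variability and drift."""

    def __init__(self, in_features: int, out_features: int,
                 noise_std: float = 0.05):   # noise_std is an assumed knob
        super().__init__(in_features, out_features)
        self.noise_std = noise_std

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.weight
        if self.training and self.noise_std > 0:
            # Perturbation scales with each weight's magnitude, so larger
            # conductances see proportionally larger noise.
            weight = weight + torch.randn_like(weight) * self.noise_std * weight.abs()
        return F.linear(x, weight, self.bias)


# Example: attention scores computed with INT6-emulated operands.
q = fake_quant_int6(torch.randn(2, 8, 64))        # queries
k = fake_quant_int6(torch.randn(2, 8, 64))        # keys
scores = fake_quant_int6(q @ k.transpose(-2, -1) / 64 ** 0.5)
```

Fake quantization keeps tensors in floating point while restricting them to the INT6 grid, which is the standard way to emulate low-precision arithmetic inside a full-precision training loop.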