Factored Radix-8 Systolic Array for Tensor Processing

Authors: Inayat Ullah, Joon-Sung Yang, Kashif Inayat, Jaeyong Chung
Year of publication: 2020
Subject:
Source: DAC
DOI: 10.1109/dac18072.2020.9218585
Description: Systolic arrays are regaining attention as the core of machine learning accelerators. This paper shows that a large design space exists at the logic level despite the simple structure of systolic arrays, and proposes a novel systolic array based on factoring and radix-8 multipliers. The factored systolic array (FSA) extracts the Booth encoding and the hard-multiple generation, which are common across all processing elements, reducing the delay and the area of the whole systolic array. This factoring comes at the cost of an increased number of registers; however, the reduced pipeline register requirement in radix-8 offsets this effect. The proposed factored 16-bit multiplier achieves up to 15%, 13%, and 23% better delay, area, and power, respectively, compared with radix-4 multipliers, even when the register overhead is included. The proposed FSA architecture improves delay, area, and power by up to 11%, 20%, and 31%, respectively, across different bitwidths when compared with the conventional radix-4 systolic array.
Database: OpenAIRE
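
Note: The description names two steps of a radix-8 multiplier, Booth encoding and hard-multiple generation, that the FSA computes once instead of in every processing element. The sketch below illustrates only the arithmetic of those two steps in software; it is not the paper's hardware design. The function names (`booth_radix8_digits`, `hard_multiples`, `multiply_from_shared`) are hypothetical and chosen for illustration, and the 16-bit default width is taken from the multiplier width mentioned in the description.

```python
def booth_radix8_digits(y: int, bits: int = 16) -> list[int]:
    """Recode a signed `bits`-wide integer into radix-8 Booth digits in [-4, 4].

    Digit i is -4*y[3i+2] + 2*y[3i+1] + y[3i] + y[3i-1], with y[-1] = 0.
    """
    if not -(1 << (bits - 1)) <= y < (1 << (bits - 1)):
        raise ValueError("y does not fit in the given bit width")
    n_digits = -(-bits // 3)  # ceil(bits / 3)

    def bit(k: int) -> int:
        # Python's >> on negative ints is arithmetic, so bits above the
        # nominal width read as the sign bit (sign extension).
        return 0 if k < 0 else (y >> k) & 1

    return [
        -4 * bit(3 * i + 2) + 2 * bit(3 * i + 1) + bit(3 * i) + bit(3 * i - 1)
        for i in range(n_digits)
    ]


def hard_multiples(x: int) -> list[int]:
    """Return [0, x, 2x, 3x, 4x]; only 3x (the 'hard multiple') needs an adder."""
    return [0, x, x << 1, x + (x << 1), x << 2]


def multiply_from_shared(multiples: list[int], digits: list[int]) -> int:
    """Sum the partial products selected by precomputed digits and multiples."""
    acc = 0
    for i, d in enumerate(digits):
        partial = multiples[abs(d)]
        if d < 0:
            partial = -partial
        acc += partial << (3 * i)  # each radix-8 digit has weight 8**i
    return acc


# Usage: the recoding of y is done once and reused with several multiplicands,
# loosely mirroring the idea of factoring shared work out of the inner multiply.
digits = booth_radix8_digits(-12345)
products = [multiply_from_shared(hard_multiples(x), digits) for x in (321, -7, 0)]
```

For a 16-bit operand this recoding yields six digits, compared with eight for radix-4 Booth recoding, so fewer partial products have to be summed per multiplication.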