Leveraging Fine-grained Structured Sparsity for CNN Inference on Systolic Array Architectures

Authors: Linqiao Liu, Stephen J. Brown
Year of publication: 2021
Source: FPL
Description: The high computational complexity of convolutional neural networks (CNNs) has motivated many studies of accelerating CNN inference on field-programmable gate arrays (FPGAs). Among these, designs that feature systolic arrays can effectively leverage the parallelism in CNNs while achieving good placement and routing quality. Weight sparsity (the presence of zeros in CNN weights) can further reduce the number of necessary multiply-accumulate (MAC) operations in CNNs, but has yet to result in performance gains on systolic arrays. In this work, we propose a novel fine-grained structured weight sparsity pattern, showcase a processing element (PE) design that leverages this sparsity pattern, and develop a systolic array CNN inference accelerator that targets an Intel Arria 10 GX1150 FPGA. When evaluated on ResNet-50 and VGG-16 models trained and pruned on the ImageNet dataset, our accelerator achieves 2.26 TOPs/s and 1.21 TOPs/s, respectively, on the MAC operations, while keeping the top-1 accuracy degradation within 5%. These results translate to $2.86\times$ and $1.75\times$ speed-ups, respectively, compared to a dense systolic array baseline.
Database: OpenAIRE