An acoustic DSP processor with reconfigurable CNN/FFT accelerators for speech enhancement

Author: Lee, Yu-Chi, 李諭奇
Year of publication: 2018
Document type: thesis
Description: 106
This paper proposes an acoustic DSP processor with a machine learning core for speech enhancement. Accelerators for convolutional neural network (CNN) and fast Fourier transform (FFT) computations are embedded in the acoustic processor. The CNN-based speech enhancement algorithm takes the spectrogram of the speech signal as the model input and predicts the desired speech mask, which is used to enhance speech intelligibility. In the CNN model, the outputs of each layer are treated as frame-like products. Because frames can share inputs, the processor can compute the required frames in advance to reduce computational complexity. Weight sharing is applied in the CNN to reduce model complexity, and weight quantization is used to reduce memory size. To optimize hardware complexity, numerous MAC and CORDIC engines handle the linear and nonlinear functions in the CNN and FFT through parallel processing, and hardware sharing exploits the similarity between CNN and FFT computations to reduce chip area. Input sharing and network arrangement reduce the volume of data movement within the hardware. The proposed DSP processor chip is designed and fabricated in a 40 nm CMOS process with a core area of 4.2 mm^2. The chip dissipates 80 mW at an operating frequency of 20 MHz. This is the first DSP processor to apply speech enhancement based on machine learning techniques. Compared with unprocessed speech signals, the enhanced speech achieves 30 to 41% higher speech intelligibility under low-SNR conditions.
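The abstract mentions CORDIC engines for nonlinear functions but does not describe how they are configured. As a rough illustration only, the textbook rotation-mode CORDIC iteration below computes a cosine/sine pair, the kind of value an FFT needs for its twiddle factors. The function name `cordic_sin_cos`, the iteration count, and the use of floating point are assumptions made for this sketch. A hardware engine would typically use fixed-point shifts and adds, and nothing here is taken from the thesis itself.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Approximate (cos(theta), sin(theta)) for theta in [-pi/2, pi/2].

    Hypothetical sketch of rotation-mode CORDIC; not the chip's design.
    """
    # Elementary rotation angles atan(2^-i); hardware would store these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # start from the inverse of the total gain so the result has unit length.
    gain_inv = 1.0
    for i in range(iterations):
        gain_inv /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = gain_inv, 0.0, theta
    for i in range(iterations):
        # Rotate toward zero residual angle; the multiply by 2^-i is a shift in hardware.
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

Calling `cordic_sin_cos(0.5)` returns values close to `math.cos(0.5)` and `math.sin(0.5)`. Each additional iteration adds roughly one bit of precision, so the iteration count trades accuracy against latency.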
Database: Networked Digital Library of Theses & Dissertations