Fast AES Implementation: A High-Throughput Bitsliced Approach
Author: | Seyed Hossein Khasteh, Saleh Khalaj Monfared, Omid Hajihassani, Saeid Gorgin |
---|---|
Year of publication: | 2019 |
Subject: | 020203 distributed computing; business.industry; Computer science; Advanced Encryption Standard; Byte; AES implementations; 02 engineering and technology; Parallel computing; Encryption; CUDA; Computational Theory and Mathematics; Hardware and Architecture; Logic gate; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; business; Throughput (business); Block (data storage) |
Source: | IEEE Transactions on Parallel and Distributed Systems. 30:2211-2222 |
ISSN: | 2161-9883 1045-9219 |
DOI: | 10.1109/tpds.2019.2911278 |
Description: | In this work, a high-throughput bitsliced AES implementation is proposed, which builds upon a new data representation scheme that exploits the parallelization capability of modern multi/many-core platforms. This representation scheme is employed as a building block to redesign all of the AES stages and tailor them for multi/many-core AES implementation. With the proposed bitsliced approach, each parallelization unit processes an unprecedented thirty-two 128-bit input data blocks. Hence, a high order of parallelization is achieved by the proposed implementation technique. Based on the characteristics of this new implementation model, the ShiftRows stage can be handled implicitly through input rearrangement and is simplified to the point where its computing cost can be neglected. In this implementation, costly byte-wise operations are performed through register shifts and swaps. In addition, the need for look-up table based I/O operations, which are used by the Substitute Bytes stage, is eliminated by using an S-box logic circuit. The S-box logic circuit is optimized to simultaneously process 32 chunks of 128-bit input data. We develop high-throughput CTR and ECB AES encryption/decryption on six CUDA-enabled GPUs, which achieve 1.47 and 1.38 Tbps of encryption throughput on a Tesla V100 GPU, respectively. |
Database: | OpenAIRE |
External link: |
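
The bitsliced data representation mentioned in the description can be illustrated with a small sketch. This is not the authors' CUDA code; it is a minimal Python illustration of the general bitslicing idea, assuming that thirty-two 128-bit blocks are transposed into 128 words of 32 bits, so that word i holds bit i of every block. The function names (`bitslice`, `unbitslice`, `add_round_key`) and the exact bit ordering are hypothetical choices made for this example, and the same round key is assumed for all 32 blocks.

```python
LANES = 32        # blocks processed together, one per bit of a 32-bit word
BLOCK_BITS = 128  # AES block size in bits


def bitslice(blocks):
    """Transpose 32 integers of 128 bits into 128 words of 32 bits.

    Word i collects bit i of every block; block j occupies bit j of each word.
    """
    assert len(blocks) == LANES
    slices = [0] * BLOCK_BITS
    for lane, block in enumerate(blocks):
        for bit in range(BLOCK_BITS):
            slices[bit] |= ((block >> bit) & 1) << lane
    return slices


def unbitslice(slices):
    """Inverse of bitslice: rebuild the 32 original 128-bit blocks."""
    blocks = [0] * LANES
    for bit, word in enumerate(slices):
        for lane in range(LANES):
            blocks[lane] |= ((word >> lane) & 1) << bit
    return blocks


def add_round_key(slices, key):
    """XOR one 128-bit round key into all 32 blocks at once.

    In bitsliced form, key bit i is broadcast to a whole word:
    all ones if the bit is set, zero otherwise.
    """
    all_ones = (1 << LANES) - 1
    return [
        word ^ (all_ones if (key >> bit) & 1 else 0)
        for bit, word in enumerate(slices)
    ]
```

In this layout, one 32-bit logical operation acts on the same bit position of all 32 blocks, which is why an S-box expressed as a logic circuit can replace table look-ups, and why a byte permutation such as ShiftRows reduces to choosing which word index is read.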