Bluestein's FFT for arbitrary N on the hypercube
Autor: | Van Emden Henson, William L. Briggs, James S. Otto, Roland A. Sweet, Paul N. Swarztrauber |
---|---|
Rok vydání: | 1991 |
Předmět: |
Cooley–Tukey FFT algorithm
Computer Networks and Communications Bluestein's FFT algorithm Fast Fourier transform Prime-factor FFT algorithm Parallel algorithm Computer Graphics and Computer-Aided Design Theoretical Computer Science Split-radix FFT algorithm Artificial Intelligence Hardware and Architecture Hypercube Hardware_ARITHMETICANDLOGICSTRUCTURES Algorithm Mixed radix Software Mathematics |
Zdroj: | Parallel Computing. 17:607-617 |
ISSN: | 0167-8191 |
DOI: | 10.1016/s0167-8191(05)80051-1 |
Popis: | The original Cooley-Tukey FFT was published in 1965 and presented for sequences with length N equal to a power of two. However, in the same paper they noted that their algorithm could be generalized to composite N in which the length of the sequence was a product of small primes. In 1967, Bergland presented an algorithm for composite N and variants of his mixed radix FFT are currently in wide use. In 1968, Bluestein presented an FFT for arbitrary N including large primes. However, for composite N, Bluestein's FFT was not competitive with Bergland's FFT. Since it is usually possible to select a composite N, Bluestein's FFT did not receive much attention. Nevertheless because of its minimal communication requirements, the Bluestein FFT may be the algorithm of choice on multiprocessors, particularly those with the hypercube architecture. In contrast to the mixed radix FFT, the communication pattern of the Bluestein FFT maps quite well onto the hypercube. With P = 2^d processors, an ordered Bluestein FFT requires 2d communication cycles with packet length N/2P which is comparable to the requirements of a power of two FFT. For fine-grain computations, the Bluestein FFT requires 20log"2N computational cycles. Although this is double that required for a mixed radix FFT, the Bluestein FFT may nevertheless be preferred because of its lower communication costs. For most values of N it is also shown to be superior to another alternative, namely parallel multiplication. |
Databáze: | OpenAIRE |
Externí odkaz: |