# Talk:Cooley–Tukey FFT algorithm

WikiProject Statistics (Rated Start-class, Low-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

Start  This article has been rated as Start-Class on the quality scale.
Low  This article has been rated as Low-importance on the importance scale.
WikiProject Mathematics (Rated Start-class, Low-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
 Start Class
 Low Importance
Field: Applied mathematics

## Visually onfusing use of e

It seems silly to use e-sub-m to denote the outputs of the sub-transform performed on the even elements of the original sequence, given that e stands for the base of the natural logarithm, in which role it appears in the same term where e-sub-m is used!

If your programming language allowed you to have a scalar e and an unrelated array e in the same lexical scope, would you do it? :)

## Weighting

Weighting is not trivial: beautiest way to do this is by dividing each f by ${\displaystyle sqrt(2)}$ in each step. Inverse transform is done by simply replacing the negative exponents with a positive one. --152.66.224.133 14:52, 20 May 2004 (UTC)lorro, 2004.05.20.

Weighting is generally trivial, at least in the common case of floating-point FFTs, since it's just an overall scaling. The √2 trick only works for radix-2 algorithms—moreover, it's not as efficient as doing the scaling all at once. However, the most efficient thing is for the user to simply absorb the scale factor into some other computation, which can typically be done at little or no cost. For example, FFTs are commonly used for convolution with a fixed kernel, so you can just abosrb the 1/n normalization into the kernel when you construct it.
I suspect that you're worrying about the fixed-point case, where indeed the scaling is non-trivial and has to be done at each intermediate stage, and the optimal scaling depends upon the data. However, this would be better discussed in the main fast Fourier transform article since the same principles apply to all FFTs, or (if someone is inspired to write a detailed discussion) on a fixed-point FFT algorithms sub-page linked to from that article. The algorithm-specific sub-pages like this one are about the abstract factorization of the Fourier matrix and the computational ordering; arithmetic issues are dealt with on the main FFT page. —Steven G. Johnson 17:31, May 20, 2004 (UTC)
I've edited the main fast Fourier transform to mention fixed-point scaling issues, in the Accuracy and approximations section. —Steven G. Johnson 19:37, May 20, 2004 (UTC)

## Neo Latin

Gauss write the article in Neo Latin, because he didn't know anything of Chinesse, if in his hands were he would have writen the article in this lenguage, is a coincidence at his times, not because he could not learn Chinesse, we all know that he can. — Preceding unsigned comment added by 190.207.17.229 (talk) 01:28, 26 October 2012 (UTC)

## Simpler way?

There's not.. um.. a more simple way of describing this, is there? For "laymen"? I understand Fourier transforms fine, have used FFTs plenty of times and used them to speed up computations of things like the Hilbert transform, but I really don't understand this article. Basically it sounds to me like they have created this formula by which you take, for instance, a 64-point DFT and break it down into 2 32-point DFTs, then you break those down into 4 16-point DFTs, and so on, until you are left with just 32 two-point DFTs to calculate? And then you calculate them and work your way back up, reassembling the results? Are there any graphical analogies? What is this "butterfly" they mention? - Omegatron 17:49, July 10, 2005 (UTC)

Butterfly_(FFT_algorithm) - look there. Fresheneesz 00:39, 24 April 2006 (UTC)
This article needs rewriting. It needs to be clear what is derivation and what is the final product. I'd make some butterfly diagrams myself, but I don't understand how to apply them to the messy math thats given. Fresheneesz 01:24, 24 April 2006 (UTC)
The derivation here is totally standard, and is not much different from what you would find in any textbook on the subject. The final product is simply to show how a DFT can be broken into smaller DFTs based on its factors. This is pointed out numerous times in the article. Can you be more specific and constructive? —Steven G. Johnson 05:37, 2 October 2006 (UTC)
The derivation may be standard, but it is lacking. It might be found in a textbook on the subject, but many math textbooks are written in a way that is very esoteric and may not be the best standard to strive for. A lot of the variables that appear in the formulae are not well-defined prior to their use. For someone very familiar with the derivation or use of the FFT in this context, the derivation and article may be as clear as day, but for the layman or for someone far removed from the derivation, it's difficult to follow. A well illustrated example would probably go a long way towards clearing up the algorithm. Jamesfett (talk) 18:46, 19 May 2009 (UTC)
Which variables aren't defined? What example would you suggest? Realize that this isn't an article on the meaning or application of the transform (see discrete Fourier transform for that) but simply on how the summation can be computed, given the definition. —Steven G. Johnson (talk) 01:24, 20 May 2009 (UTC)
A layperson who doesn't understand summation notation or complex exponentials simply has no hope of understanding this FFT algorithm in detail without doing a lot of reading outside this article first. That's an impossible standard to request here. The most we can provide for a layperson is a description of the impact and significance of the algorithm, the general idea in handwavy terms (you break a DFT into smaller DFTs, recursively, via the factors of N), and the historical background. —Steven G. Johnson (talk) 01:24, 20 May 2009 (UTC)

It seems that it might actually be *much* clearer if the sigma notation were turned into (1st, ellipses, last) format. I'll do this in a few days myself if there are no objections. 68.163.184.132 03:39, 2 October 2006 (UTC)

Please don't. The Σ notation is not only completely standard, consistent with numerous other articles (discrete Fourier transform etc.), more readable for complicated sums, and much more compact than ellipses, it is also the only reasonable way to show nested summations (which are required to describe the general Cooley-Tukey algorithm). If a reader doesn't know what Σ means then they have no hope of understanding the other math; such a reader should stick with the English description in the introduction. —Steven G. Johnson 05:27, 2 October 2006 (UTC)

## picture

I think an 8 point picture of the butterfly network would help the reader. Does anyone have one that is fair use? Jdufresne 01:01, 5 March 2006 (UTC)

Why don't we make one? I think the butterfly diagrams would help alot. Fresheneesz 00:39, 24 April 2006 (UTC)

## Source Code

Here's a simple implementation in Python based on the description in the article (which I thought was quite good). I spent a lot of time looking for a quick little implementation like this for python but all I could find was heavy mathematical libraries with lots of baggage.

from cmath import *
def fft(x):
N=len(x)
if N==1: return x

even=fft([x[k] for k in range(0,N,2)])
odd= fft([x[k] for k in range(1,N,2)])

M=N/2
l=[ even[k] + exp(-2j*pi*k/N)*odd[k] for k in range(M) ]
r=[ even[k] - exp(-2j*pi*k/N)*odd[k] for k in range(M) ]

return l+r



—The preceding unsigned comment was added by 24.77.42.65 (talk) 03:13, 15 March 2007 (UTC).

## first use of algorithm

I was a radio-astronomy student for a time in the Cavendish, in the late 60s/early 70s. I remember one of the great duo giving us a talk and telling us how the Cambridge University Maths Lab guys had actually implemented the algorithm independently, for use by the radio-astronomers, well before its publication by CT. Linuxlad (talk) 20:48, 10 March 2009 (UTC)

The article already mentions the fact that the algorithm was discovered at least as early as Gauss in 1805, as well as by several subsequent re-inventors. The name "Cooley-Tukey algorithm" has stuck anyway (and not entirely without justification: Cooley & Tukey made a significant contribution by clearly describing and analyzing the algorithm and showing the N log N complexity). —Steven G. Johnson (talk) 21:09, 10 March 2009 (UTC)
I see only Gauss in the intro./history. I doubt that the maths lab guys would have put forward their algorithm without being away of its key efficiency for large transforms (the RA Group FFTs and later the MRC Xray diffraction inversions where for many years the major users of the Cambridge machines). I recollect CT also said that one of the US radar (?) facilities (Lincoln Labs???) had an implementation in hardware without being aware it reproduced the CT algorithm. Bob aka Linuxlad (talk) 13:08, 11 March 2009 (UTC)
The intro also says that the algorithm was rediscovered multiple times in the 19th and 20th centuries, and cites a nice article by Heideman with more information. Why is your anecdote about Cambridge more important than all the others? As for your "doubt", the same algorithm was discovered multiple times without anyone (even Gauss) pointing out that it was N log N. People noticed that it was more efficient, certainly (Gauss wrote that it "greatly reduces the tedium of mechanical calculation" if I recall correctly), but that's not the same as quantifying the improvement. Anyway, without a reputable published source for your anecdote, this discussion is pointless. —Steven G. Johnson (talk) 20:10, 11 March 2009 (UTC)

Well Gauss discover a formula who predicts how many prime numbers are smaller than a given number N, as you may guess the formula is N/log(N), is quite similar!!!!, later Gauss was proven right. — Preceding unsigned comment added by 190.207.17.229 (talk) 02:28, 26 October 2012 (UTC)

## External References

I have just tried adding a link to a blog post of my own, but was automatically reverted, describing a python implementation of the Cooley-Tukey algorithm for general factorizations. Most sources on the web seem to concentrate on the radix 2 FFT, be the decimation in time or frequency, so it is hard to get a feeling of how anything else really works or is implemented. I felt (and still feel) this to be a worthwhile addition. Wouldn't mind rewriting it for wikipedia, if the problem is linking a personal site, but would like some pointers on how to proceed. Jfrio (talk) 11:38, 29 May 2009 (UTC)

## Running time

"0.02 minutes"? Why not 1.2 seconds? The original paper may have quoted the time in this unusual fashion, but I think we'd be justified in using more conventional units in the article. Tevildo (talk) 23:48, 8 December 2009 (UTC)

## Differences in the Cooley-Tukey Methods within Mathematics Tools

The definition presented here is consistant with the definition in my undergraduate college textbook (Lathi), but both definitions seem to be inconsistant with how the mathamatics tools I use implement them. For example, MATHCAD uses the 1/N factor on the FFT, but not on the IFFT. It also does not produce the first value since the summation starts from 1 instead of from 0. On the other hand, MATLAB produces the first value and uses a factor of 1 on the FFT and the 1/N factor on the IFFT like here, but it uses a positive sign in the exponent in the exponential of the FFT rather than a negative sign, and the negative sign is used in the IFFT. So it is reversed. My questions are these: Are these variations simply wrong, or are they as correct as what is presented here and just dependant upon how they were derived? It seems that the Cooley-Tukey Method should only be one way. Am I wrong? Also, what is the impact of using these different tools and/or methods? I would suspect that cycling through the MATLAB version (i.e., time domain sampled function --> transform --> time domain sampled function) should give the same result as the version presented here. But what about the transforms while in the frequency domain? Is it comparing apples-to-apples? And then for the MATHCAD version, I suspect that this one would have to have some additional operations on it to yeild the same results as the other two, like adding back in the first term and changing the 1/N weighting by multiplying thorugh by N. Am I wrong in assuming the results would be very different? Why do they define it so differently across the board? I would think that a company selling a toolbox like this would want to stay consistant with what the math/engineering community expects. Does anyone have any insight into this? Thanks in advance. J.G. Lindley (talk) 21:48, 21 March 2011 (UTC)

## Normal vs. Reversed Bit-order/ DIT vs. DIF

In the section on input ordering, it is stated that the defining difference between DIT and DIF algorithms is the order of the inputs and outputs. This is not true. The DIT algorithm can be implemented with either the inputs or outputs in normal order. Same for the DIF algorithm. DIT applies the twiddle factor to both arms in the butterfly (twiddle factor is applied to odd component in the flowgraph before it is summed with the even), whereas with DIF, the twiddle factor is applied after the sum takes place. (the 2pt butterfly elements are different). See Richard Lyons' "Understanding Digital Signal Processing" 1997 edition pg 151. 128.59.159.194 (talk) 08:00, 4 December 2011 (UTC)

The definition of DIT and DIF are given in an earlier section (General Factorizations): DIT is when N1 is the radix, and DIF is when N2 is the radix, in the notation of that section, which is essentially equivalent to the definition you cite. This comes less from the placement of the twiddle factors, however, than from the meaning of the name "decimation": going back to the original meaning of "decimation" (in which every 10th Roman soldier was executed), a radix-r decimation time takes every r-th input in time domain for a subproblem, while radix-r decimation in frequency subdivides the problem by taking every r-th output in frequency domain. But you are right that the ordering section should be clearer that they could be implemented in either way. — Steven G. Johnson (talk) 17:35, 4 December 2011 (UTC)

## Clarity

I know its not easy to write articles on complex mathematical subjects in the tone of a general encyclopedia, but I can't make heads or tails of what is going on in here. A few additional paragraphs putting this in layman's terms would go a long way. ThemFromSpace 06:46, 14 December 2012 (UTC)

What, specifically, in the topic paragraph, would you want clarified for a non-technical audience? (FFT algorithms and discrete Fourier transforms already have their own articles, so this is not the right place to introduce those topics.) — Steven G. Johnson (talk) 14:14, 12 March 2013 (UTC)

I have to agree with Themfromspace, this is one of the typical examples of maths/science articles in Wikipedia which fails across the board at sharing knowledge to non-maths/science experts. Although it may be accurate, the maths short-hand and the obscure references to other concepts that the author(s) assume are known to the reader mean that it is like struggling through a barrel of treacle to understand what is going on. I have just put together an FFT package in C++ for transforming WAV data into frequency data and it works, but my what a struggle to accumulate the information from various places online and piece together very patiently the actual process. I belive that an encyclopedia is meant primarily to share knowledge, not obfusticate it and condem it to the realms of references for people who are already experts in their field. Along with phrases such as "This is hard" or "very complicated process", maths shorthand in large blocks should be avoided absolutely unless it is provided with a key and can also be talked through in plain english. I hope that maths and science articles in Wikipedia improve dramatically in the future. I feel that this article is rich in information, but arrogant in it's delivery. And I do not wish to be rude. But I am (and I am sure others are) frustrated. - Sam, UK — Preceding unsigned comment added by 46.233.70.199 (talk) 10:15, 17 January 2014 (UTC)

## O(N log log N) time when r=sqrt(N)?

Shouldn't the time complexity be O(N log log N) instead of O(N log N) when the radix, r, is always chosen to be sqrt(N)? With that choice of r, there are clearly log_2(log_2(N)) levels, (since it takes exactly log_2(log_2(N)) times taking the square root of N to reach the value 2), and each level clearly does O(N) work, (where N is the total N, not the N at a given level), by multiplying the twiddles to every number twice and adding in each number twice, (twice because first all of the rows are FFT'd and then all of the columns are FFT'd). Of course, sqrt(N) is only exact each time if N is of the form 2^(2^k), such as 4, 16, 256, 65,536, or 4,294,967,296, but the article specifically mentions that it takes O(N log N) time in the case when r=sqrt(N). This seems to conflict. Ndickson (talk) 04:44, 27 September 2013 (UTC) (Edited: Added missing 256 = 2^(2^3) to list.) Ndickson (talk) 05:00, 27 September 2013 (UTC)

I am expecting to add a code block to the Pseudocode section of Cooley-Tukey FFT algorithm based on my naive c++ implementation. I have not written pseudocode before and would like some other users to vet it before I submit it in the article. If a user thinks it is not a good idea to add such to this section please explain.

This is the current development of my pseudocode. Users are invited to make edits using the same guidelines as apply to articles.

x0,...,N−1 ← diffft2(x, N):             DFT of {x0, ...,xN-1}:
φT ← eπi/N
for n ∈ {2-k × N : k ∈[1, log2(N))}
φT ← φT × φT
T ← conj(φT)
for a ∈ [0, n/2)
T ← φT × T
for b ∈ {a + kn : k ∈ [0, (N/n))}
t ← xb - xb+n/2
xb ← xb + xb+n/2
xb+n/2 ← T × t


Sorry, this is all the effort I have today. I think this approach has a lot of advantages.

1. It operates in place with only five registers.
2. There is only a single computation of the complex exponential in the entire procedure.
3. It does not compute twiddle factors with negative imaginary parts.
4. It does not twice compute any factors in the inner two loops.
5. There are no statements which follow loops (total branch prediction certainty).
6. The innermost loop can be executed in full parallel.
7. The second nested loop can be configured for execution in any synchronous order and some asynchronous ones.
8. There is no enveloping logic (predictable performance).

This is a copy of my c++ implementation. It requires annotation so I apologize.

unsigned int reverse(unsigned int x, unsigned int maxbits)
{
x = (((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1));
x = (((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2));
x = (((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4));
x = (((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8));
return ((x >> 16) | (x << 16)) >> (32 - maxbits);
}

void decimate(vector<complex<double>> &input)
{
unsigned int N = input.size(), m = (unsigned int)log2(N);
for (unsigned int x = 0; x < input.size(); x++)
{
unsigned int y = reverse(x, m);
if (y > x)
{
complex<double> t = input[x];
input[x] = input[y];
input[y] = t;
}
}
}

void fft(vector<complex<double>> &x, bool normalize)
{
unsigned int N = x.size(), k = N, n;
complex<double> phiT = 3.14159265358979 / N, T;
phiT = complex<double>(cos(phiT.real()), sin(phiT.real()));
while (k > 1)
{
n = k;
k >>= 1;
phiT = phiT * phiT;
T.real(phiT.real());
T.imag(-phiT.imag());
for (unsigned int l = 0; l < k; l++)
{
T *= phiT;
for (unsigned int a = l; a < N; a += n)
{
unsigned int b = a + k;
complex<double> t = x[a] - x[b];
x[a] += x[b];
x[b] = t * T;
}
}
}
if (normalize)
{
complex<double> Factor = 1.0 / sqrt(N);
for (int i = 0; i < N; i++)
x[i] *= Factor;
}
}


184.17.219.58 (talk) 16:03, 30 June 2015 (UTC)

input : ${\displaystyle x\in \mathbb {C} ^{n}}$ where ${\displaystyle n}$ is a power of ${\displaystyle 2}$, output : ${\displaystyle X\in \mathbb {C} ^{n}}$ its discrete Fourier transform (DFT)

${\displaystyle X(k)=\sum _{n=0}^{N-1}x(n)e^{-2i\pi nk/N}}$

the naive algorithm takes ${\displaystyle N^{2}}$ multiplications/additions but we will do it in ${\displaystyle 2N\ln N}$ operations by a divide and conquer strategy

transform ${\displaystyle x(n)}$ into two downsampled vectors

${\displaystyle y_{0}(n)=x(2n)\qquad \qquad y_{1}(n)=x(2n+1)}$

recursively apply the algorithm to calculate the DFT of those two vectors so that

${\displaystyle Y_{0}(k)=\sum _{n=0}^{N/2-1}y_{0}(n)e^{-2i\pi nk/(N/2)}\qquad \qquad Y_{1}(k)=\sum _{n=0}^{N/2-1}y_{1}(n)e^{-2i\pi nk/(N/2)}}$

${\displaystyle N/2}$-periodize the two obtained DFT so that ${\displaystyle Y_{0}(k)=Y_{0}(k+N/2)}$ and ${\displaystyle Y_{1}(k)=Y_{1}(k+N/2)}$, then mix them to obtain the whole DFT

${\displaystyle {\begin{array}{lll}Y_{0}(k)+e^{-2i\pi k/N}Y_{1}(k)&=&\sum _{n=0}^{N/2-1}x(2n)e^{-2i\pi nk/(N/2)}+e^{-2i\pi k/N}\sum _{n=0}^{N/2-1}x(2n+1)e^{-2i\pi nk/(N/2)}\\&=&\sum _{n=0}^{N/2-1}x(2n)e^{-2i\pi 2nk/N}+e^{-2i\pi k/N}\sum _{n=0}^{N/2-1}x(2n+1)e^{-2i\pi 2nk/N}\\&=&X(k)\end{array}}}$

78.227.78.135 (talk) 18:27, 15 January 2016 (UTC)

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).