Talk:Discrete cosine transform

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated B+ class, Low-importance)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
A-BB+ Class
Low Importance
 Field: Applied mathematics
This article has comments.

History behind the term[edit]

Which mathematician (or group of mathematicians) coined this term? What year was it? Was it used before the year 1900? The main page of this article could be improved by giving a brief history behind the term, and what alternative terms were used before this term stuck. 216.99.201.228 (talk) 06:50, 19 September 2010 (UTC)

Change in notation[edit]

I have started converting the notation in this article to match the changes that have been made to Discrete Fourier transform . The four changes are:

1. n becomes N: Change all instances of the symbol n to the new symbol N to represent the length of the sequences in both the time and frequency domains.

2. k becomes n: Change all instances of the symbol k to the new symbol n to represent the index variable for the time domain sequence.

3. j becomes k: Change all instances of the symbol j to the new symbol k to represent the index variable for the frequency domain sequence.

4. f becomes X: Change all instances of the symbol f to the new symobl X to represent the transformation of the data sequence in the frequency domain.

For more information, see Talk:Discrete Fourier transform.

-- Metacomet 16:09, 6 January 2006 (UTC)


I just tried typing in the formula for the DCT 2 and 3 into my compiler to see what would happen. It wasn't inverting properly. It looks like it should be $-\frac{1}{2} x_0$ instead of positive. Adding in the negative sign made it work for me.

Nope, you must have mistyped them, or have a bug in your program. (To see that your proposed minus sign isn't correct, consider the simple case of a transform of all 1's.) —Steven G. Johnson 22:15, 1 September 2006 (UTC)

less algebra, and more explanation please[edit]

This article is pretty poor IMHO. How about explaining what it is without resorting to algebra, and also explaining its applications... --Rebroad 10:21, 19 November 2006 (UTC)

Sorry, if you don't even know algebra you have no hope of understanding a DCT. Don't blame the article for your own limitations. —Steven G. Johnson 20:18, 19 November 2006 (UTC)
Actually I do not entirely agree. I am not a physicist yet I understand what a nuclear weapon is (without understanding its mechanics) and Dr. Hawking has managed to simplify General Relatively for the masses without resorting to advanced mathematics.
It would be nice if some qualified person would provide a better lay summary of DCT to explain what it does without needing to understand necessarily how it specifically does it. For example: DCT is used in lossy image and audio compression by converting bitmaps into frequency.... and so on.
This is not Mathpedia, it is an encyclopedia which is by definition supposed to contain content for the lay person, not serve as an academic textbook. I liken this to a dictionary that skips the definition to go right to usage and philology. Tolstoy143 - "Quos vult perdere dementat" (talk) 21:38, 7 March 2008 (UTC)

DCT for non-uniform distributed data points[edit]

Is it possible to compute a DCT if the data points are non-uniformly distributed? i.e. if I only know the function x(t) values at t=0.0, 0.1, 0.4, 0.5, 0.55, 1.0 or so? I don't want to interpolate to artificial data points to get uniform distributed data points. --89.49.215.47 20:09, 26 November 2006 (UTC) (de:Benutzer:RokerHRO)

Excellent question. First, the answer is yes, you can perform a DCT on data that is non-uniformly distributed. Basically, replace n/N with the actual grid point. But the more important question is, can we do this in a fast transform? There was groundbreaking research in the 1990s that said "yes" (for the case of the non-equispaced Fourier Transform), and hence probably "yes" for the cosine case.
For information on the non-equispaced FFT, google "NFFT". There are now several fast algorithms. Lavaka (talk) 00:10, 25 January 2008 (UTC)

Shifted variants of DCT[edit]

What is impact on resulting DCT if instead of calculating

X_k = \sum_{n=0}^{N-1} x_n \cos \left[\frac{\pi}{N} n k \right]

we calculate X_k = \sum_{n=0}^{N-1} x_n \cos \left[\frac{\pi}{N} (n + \frac{1}{N}) k \right]?
Am I calculating DCT I + \frac{1}{N}?
Is it possible to emphasize importance of DCT I-VIII variants over other shifted DCT?
—The preceding unsigned comment was added by Arkadi kagan (talkcontribs) 13:17, 30 January 2007 (UTC).

Shifting by 1/N does not result in a DCT according to any of the standard definitions, and is not sensible because only shifts by integers or half-integers preserve the sampling points under mirror flips. —Steven G. Johnson 14:19, 30 January 2007 (UTC)
I had to do some homework to get your point:)
e^{j 2 \pi \frac{m n}{N}} is orthogonal set of functions as well as e^{j 2 \pi \frac{(m+\frac{1}{2}) n}{N}}, but not e^{j 2 \pi \frac{(m+\frac{1}{N}) n}{N}} and not e^{j 2 \pi \frac{(m+\frac{1}{4}) n}{N}}
Thanks. Arkadi.
Having an orthogonal basis is not really the critical issue; there are lots of non-orthogonal bases out there that are very important. The reason that a DCT is a useful basis is that it corresponds to a DFT + even symmetry. This even symmetry is broken by the "grid" of sampling points if you don't place your origin at either a sample or halfway between two samples. —Steven G. Johnson 00:46, 1 February 2007 (UTC)

Sorry. I didn`t get you.[edit]

As far as I know orthogonality is the sole property that gives us Inverce DCT. If you know something not based on orthogonality, please point me on.
Relation to DFT gives us to calculate some properties easily. But once it is calculated, it have own uses. For example, signal processing usually works with real numbers.

This even symmetry is broken by the "grid" of sampling points if you don't place your origin at either a sample or halfway between two samples.

Even symetry is one property of DCT. But if some other transform have or have not even symetry what is the influence on DCT discussion? Notice that I am refering to your own definition:

Formally, the discrete cosine transform is a linear, invertible function F : RN -> RN (where R denotes the set of real numbers), or equivalently an N × N square matrix.

Thanks. Arkadi.

A square matrix can be invertible without being orthogonal; the inverse may just be more complicated to compute. And there are infinitely many invertible real matrices, so the fact that it is real is hardly the defining property either. The things that make the DCT valuable to signal processing and partial differential equations and Chebyshev approximation and numerous other applications (its energy compaction property, its diagonalization of convolutions with certain boundary conditions, its properties under differentiation, its relationship to a Fourier cosine series) all stem directly from its relationship to a DFT of even symmetry.

(Indeed, for the normalization choices shown in the article, the DCTs are not orthogonal, although they are related to orthogonal matrices via multiplication by diagonal matrices.)

—Steven G. Johnson 16:58, 1 February 2007 (UTC)

If so, may be I don`t know what is exactly Discrete Cosine Transform.
A square matrix can be invertible without being orthogonal; the inverse may just be more complicated to compute.
In case you have something simple like 2 * I you can know its inverse easily - it`s \frac{1}{2} * I. Is it Discrete Cosine Transform?
Thanks. Arkadi.
No. "Discrete cosine transforms" are specific invertible linear transformations corresponding to DFTs of real-even data. They are not just any one of the infinitely many other possible invertible real matrices, nor are they any one of the infinitely many possible real orthogonal matrices. I don't understand how you could have read what I wrote above to conclude that 2I is a DCT. —Steven G. Johnson 20:57, 3 February 2007 (UTC)

I will not try to prove that 2I is a DCT :)
I want to locate precise boundary: what transform can be called DCT and what transform can`t. What I am seeking for is a precise definition of DCT. When I have such a definition, I can (or can`t) see all the mentioned above properties - either obvious or complex. Arkadi.

The article already has a long discussion of how the 8 types of DCT arise from a DFT of real-symmetric data. —Steven G. Johnson 17:27, 4 February 2007 (UTC)

I think I have recognized the source of my personal ignorance in all this topic. I see that none of DCT I-VIII do not have orthogonal basis function:
cos \left( \frac{2 \pi}{N} n k \right) is indeed orthogonal basis.
However, cos \left( \frac{\pi}{N} n k \right) is not.
Despite this, I do see some superior properties of basis cos \left( \frac{\pi}{N} n k \right). For example, spectrum calculated by DCT I (boundary effect is neglectable here) gives cleaner picture than the same spectrum, calculated by "orthogonal basis" DCT.

Please, help me to understand calculation of Inverse DCT given non-orthogonal basis like cos \left( \frac{\pi}{N} n k \right).
Thanks. Arkadi.

You're incorrect to state that they do not have an orthogonal basis. All of the DCTs I-VIII, appropriately normalized, are orthogonal matrices (i.e. form an orthogonal basis). (The relationship between DCT and DFT is analogous to the relationship between the cosine series and the Fourier series. The factor of 2 in the cosine argument is due to the fact that you can cut the integration interval in half by the even symmetry.) There are many fine textbooks on discrete cosine transforms (e.g. the Rao and Yip book) that you can read to help clear up any misunderstandings—Wikipedia talk pages are not someplace where you can expect to get extensive tutoring. —Steven G. Johnson 19:09, 6 February 2007 (UTC)
After reading this article from a beginners point of view, and despite I have used and implemented various DCT variants, that were used in proiduction priduct form image and audio processing (since 1984!), I must admit that this article is a real nightmare, that just mixes opinions about what are considered the current strategies, and what should be first a simple but solid definition.
This article fails in all areas, is extremely badly structured. It should first present the simple DCT, present how it is computed, what it gives by an example, and then present how it can be inverted. Then the properties should be studied, and later the link to the more general theory of Discret Fourier transforms may be added, as well as variants of the algorithm.
I have never heard the terminology of "even" and "odd" ends used everywhere here without any definition ; I never needed that terminologogy to study DCT algorithms since 20 years, and I bet this comes from an abestract taken from an advanced paper.
Most people won't need that terminology.
And I am completely opposed to your opinion here: Wikipedia should be the place for tutorial. Sending people to other resources first is really not the role of Wikipedia, which is to introduce any subject with the simplest approach, and then add some development in other more detailed articles, and then send them to other external advanced resources.
This articles fails everywhere, it is not made for Wikipedians, it's just a badly written abstract of some external paper!
If I had some better English skills (this is not my native language, although I can understand it without difficulty), I would delete it and restart it completely from scratch, with a much better structure, and much less assumption about readers knowledge ! It is unnecessarily complicated, despite a DCT can be very simple to understand. Thanks, when I learned it, there was still no Internet, no Wikipedia. But academi resources were much better written and much more accessible than this very poor article. 86.221.100.174 21:57, 16 February 2007 (UTC)

Perhaps, I can not agree about orthogonality of base function cos \left( \frac{\pi}{N} n k \right).
We can check this for case N = 2 (or N = 4 if you want).
However, inverse transform can be easily computed given this trick:
In case x_n = x_{2N-n}

\sum_{n=0}^{2N-1} \left[ x_n cos \left( \frac{nk}{2N} 2\pi \right) \right] = 2 \sum_{n=0}^{N-1} \left[ x_n cos \left( \frac{nk}{N} \pi \right) \right].

I suppose, similar conversions are possible for the rest of boundary assumptions.

You have the incorrect limits on the second sum (which should go from 0 to N, for an even extension), and there is no factor of 2 on the n=0 and n=N cases. However, the general idea is in the right direction: all 8 types of DCT can be converted to larger DFTs of real-even data with appropriate symmetries (you can put a sine term into the first sum to get a DFT, because the odd sine terms will sum to zero for even data). ...and because the DFT is unitary (with appropriate normalization), all of the DCT types are orthogonal (with appropriate scaling). Again, however, it seems like you need more extensive tutoring than you can reasonably expect on a Talk page (or from an encyclopedia article). —Steven G. Johnson 17:22, 11 February 2007 (UTC)

Formal definition[edit]

Under the formal definition, it is not stated what k represents. Or have I missed something? --Sam Pablo Kuper 23:41, 4 April 2007 (UTC)

Fixed. —Steven G. Johnson 20:15, 5 April 2007 (UTC)

Would be nice if someone added an example[edit]

Would be nice if someone added an example of how the DCT saves memory, by showing an 8x8 color block and going thru the math. Thanks, Daniel.Cardenas 18:38, 5 April 2007 (UTC)

See JPEG. —The preceding unsigned comment was added by Stevenj (talkcontribs) 20:12, 5 April 2007 (UTC).

n=1?[edit]

Great page, really helped me out. The summation given in the formula for DCT-III starts at n=1, should this be n=0?

No, the n=0 term is handled by a separate term since its normalization is different. —Steven G. Johnson 17:44, 11 April 2007 (UTC)

Misleading example[edit]

The example of the DFT and DCT of the image of the dandelion clock is highly misleading, because the original image was a JPEG-compressed image which will have thrown away information from the high-frequency DCT coefficients - so of course the DCT will have very little energy there! I think a new example should be created based on a genuine, non-lossy-compressed original. Rhebus 11:35, 23 May 2007 (UTC)

it is not exactly correct: JPEG compression removes higher frequencies in each 8x8 block, while the picture shows the transform of the whole image; then JPEG causes the blocking artifact that introduces sharp changes in intensity on the edge of blocks, thus introducing other higher frequency. In any case, I was thinking about re-making the picture using Image:Lichtenstein img processing test.png, since I have used the same picture for other examples (see the description of the picture). You might join the discussion about standard test images on the Electronics wikiproject Alessio Damato 14:25, 23 May 2007 (UTC)
just to be sure I made a test: I have seen the DCT and FFT of a lossless image and of the same image heavily JPEG compressed (10% quality). I have noticed that the compressed picture had a wider spectrum than the original (I think because of the blocking effect). You can try to do it as well with any picture: you can find the Matlab code in the description page of the picture. Alessio Damato 14:33, 23 May 2007 (UTC)
Ah yes, of course you're right that JPEG only locally removes higher frequencies. However I still think that the example does not show what it claims to show, and I'd definitely like to see another. Rhebus 16:21, 24 May 2007 (UTC)
any suggestion?! what's wrong with that, then? :-) Alessio Damato 20:55, 24 May 2007 (UTC)
Well it's my understanding that the image is trying to illustrate that because the DCT is good at compressing the image energy into one small corner of the transform-space, it is useful for image compression. But, in order to show this, you need to demonstrate that DCT does this on an uncompressed image. Otherwise all you demonstrate is that DCT is effective at compressing the energy in images which have already been JPEG-encoded. Rhebus 14:53, 29 May 2007 (UTC)
I also note that the Liechtenstein test image is also JPEG-based. I accept that it isn't a terrible test image, and you have taken steps to mitigate JPEG artifacts, but surely the best test image deals with real, unprocessed data? Rhebus 14:58, 29 May 2007 (UTC)
well, in general I agree that real, unprocessed data have the highest quality, but I'm not sure we really need it for our purpose. I mean: a test image has to have some features (sharp edges, flat surfaces, details), like Liechtenstein, moreover it should have power in any frequency, and I think Liechtenstein has. If you hadn't known the picture was generated from a JPEG, would you have been able to realize it was originally stored as a JPEG?
I could go out for a walk and take a picture with my camera, storing it with TIFF to avoid any artifact, but I don't think it would be a featured picture... do you think we should try to take a picture by ourself to be used as test?
(I think this discussion should move to the wikiproject about electronics, since it's a topic about a more general subject than DCT) Alessio Damato 21:46, 29 May 2007 (UTC)

Orthogonal[edit]

This page seems to mention changing scale factors to make the transform orthogonal. I am fairly sure the transform is orthogonal without the scaling factors, and that the scaling is done to make the transform Orthonormal. —Preceding unsigned comment added by 208.215.227.190 (talk) 18:48, 7 September 2007 (UTC)

Whilst the transform is orthogonal with any scale factor, the matrix is only orthogonal (by the definition given at Orthogonal matrix) if each of its columns is of unit magnitude. Oli Filth 19:17, 7 September 2007 (UTC)

figure doesn't match text[edit]

unless I'm missing something, the figure that lists DCT I, II, III, and IV doesn't match with the text. The text describes the DCT-II as "...where the even-indexed elements are zero. That is, it is half of the DFT of the 4N inputs y_n, where y_{2n}=0 ... ". This matches DCT-III in the figure. Which is correct? BTW, I like the figure a lot, very intuitive. Lavaka (talk) 03:14, 18 November 2007 (UTC)

The part that says "y_{2n}=0" refers to the input that would have to be provided to do an equivalent DFT, whereas the diagram shows the way that the DCT's input is effectively interpreted. Also, note that y_{2n}=0 means "every even element is zero", rather than "the element past the end of the sequence is zero", which is what can be see in the diagram of DCT-III's input (I am assuming that this was your interpretation). My best guess is that the zeros are added in order to allow the mid-point to fall at an integer index. This is required because the boundaries are fractional (n=-1/2 and n=N-1/2). Note that the diagram for DCT-III clearly shows that the boundaries are not fractional - they are the first value of the sequence (n=0), and the value past the end of the sequence (n=N) which is set to zero. --82.18.14.143 (talk) 12:50, 3 June 2008 (UTC)

SA-DCT[edit]

SA-DCT: Shape adaptive DCT. A new section inside DCT or a new article about SA-DCT should be created. Diving hawk (talk)


Default DCT Page[edit]

DCT should go to Discrete Cosine Transform and then mention up top DCT can also refer to DCT File Format. —Preceding unsigned comment added by 203.129.33.32 (talkcontribs) 00:21, 15 April 2008

There are approximately 15 entries on the DCT disambiguation page. There's no reason to suggest that this article is the most prominent or relevant. Oli Filth(talk) 23:26, 14 April 2008 (UTC)

Actually, this seems like a confusion over the disambig pages, of which there are two. OP is looking at the page for "Dct", which has only this article and the DCT File Format. Responder is referring to the page for "DCT", which does in fact have quite a few more entries. Cdecoro (talk) 05:27, 27 January 2009 (UTC)

Serious Problem in Example[edit]

While the example makes a powerful case for the DCT over the DFT, after looking into it in detail I'm afraid that it is irreparably incorrect due to an odd ambiguity in the difference between Matlab's dct and dft functions. Specifically, the dct function is implemented in such a way that is partially normalized, scaling its result by 1/sqrt(2*n). This happens in line 44 of dct.m (Matlab R2008b), and corresponds to what the article terms as the unitary modification. In contrast, the fft function is not implemented in this manner, and instead matches the unnormalized naive DFT. In any case, rather than try to pick apart Matlab's code, it would be better to look at an alternate implementation. I have recreated the example both with FFTW and with a direct implementation of the naive form (I can provide the code on request). Both are equivalent to each other, and appear to match the DFT example shown, but the DCT example shown in the figure is very different, and looks significantly closer to the DFT result. That is to say, the clear advantage of DCT as shown in the example - which was, in fact, what generated me to try to recreate it in the first place - is not at all evident.

In retrospect, there is no reason to assume that the DCT should show the such an improvement: as the FFTW website points out repeatedly, the DCT is equal to a real-even DFT of twice the length. As this article points out, the difference is specifically in the assumptions placed on the behavior outside the sample window (periodic vs. even extension). While this is heuristically useful in 8x8 image blocks, in which case the true extension is likely to resemble an even-mirrored function rather than a periodic one, in this whole-image example a periodic extension would appear to work just as well (all the borders have similar frequencies and colors). Now, I'd like to stress that I don't want to appear as denegrating Mr. Damato's hard work and good intentions. But unless I am wrong (and I am open to that possibility, though after having considered and investigated this for a while I believe it to be an unlikely one) it will be necessary to promptly remove the example. Best, Cdecoro (talk) 02:54, 29 January 2009 (UTC)

If the image is continuous within the image, then a real-even extension will also be continuous, whereas a periodic extension will not in general be continuous at the boundaries, leading in general to better "energy" compaction for the DCT. If the image itself is discontinuous, however, one doesn't expect to see much of a difference; part of the problem with this example may be that it has so many sharp edges from the blades of grass. (For the same reason, JPEG compression is well known to have the hardest time with sharp lines and edges.) To see a really dramatic difference, it would be good to use a fairly smooth (soft-focus?) image, with very different brightness at the boundaries (e.g. sky above and ground below) as you point out. (The nice thing about 8×8 blocks is that, even if the overall image has many sharp edges, normally there will be many smooth sub-blocks within the image.)
I agree that a clearer example would be better, in any case—actually, it might be good to have an example where the DCT is better along with an example where it is about the same (e.g. an image with sharp edges), just to dispel any notion that the DCT is magically always better. —Steven G. Johnson (talk) 17:46, 29 January 2009 (UTC)
You make a very good point about the issues that this image's properties would have on the DCT. But I want to stress that the problem I am mentioning here is not a subjective issue with the presentation of the figure, but specifically that the two subfigures are not showing the same thing, because the DCT image is scaled down by a significant amount. This is therefore not an "apples-to-apples" comparison. Since it is objectively incorrect, it needs to be removed. I would replace it with an image with equal scale factor, but the DFT actually appears to be better than the DCT. Cdecoro (talk) 21:14, 31 January 2009 (UTC)

I should also point out that I believe the division by max(C) in the Matlab code may further exacerbate the problem, as well as the fact that the image is cropped to the lower-left corner, which obscures the fact that the DFT has negative frequencies. But these are minor compared to the larger problem. Can someone please confirm or refute my statement of the problem? (again, I would be happy to be shown wrong, if that is truly the case) If there is no objection, I should erase it. I'd especially like to hear from the original author, btw, if he has the time. Cdecoro (talk) 21:14, 31 January 2009 (UTC)

N != N !!![edit]

All the formulas confusingly use N to refer to both the length of the input and output vectors, but these are typically different. And so what, for example, does it mean to have " PI / N " in the formula??? Which N??? This is VERY confusing -- why not use n/N and k/K which is a much more standard kind of notation and is not confusing.

For DCTs, the input and output vectors are always the same length; otherwise the transform would not be invertible. There is only one N. —Steven G. Johnson (talk) 04:43, 23 June 2009 (UTC)

805 kbyte animated gif[edit]

Er. Do you realize that this article now has a ~1mbyte animated gif inline? Thats seriously not-nice to people with limited bandwidth. --96.241.226.166 (talk) 22:52, 3 April 2011 (UTC)

I would suggest adding a redirect to http://en.wikipedia.org/wiki/Dual-clutch_transmission because that uses the abbreviation DCT. — Preceding unsigned comment added by 108.4.30.149 (talk) 02:26, 17 May 2012 (UTC)

THANK YOU![edit]

Whoever originally wrote and all those who helped improve the informal overview section, you know your stuff and you have a knack for explaining it well! Thank you very much. — Preceding unsigned comment added by 2602:304:47EB:F9F9:21B:77FF:FEAD:46DE (talk) 15:50, 31 August 2012 (UTC)

My formula[edit]

Cosine transform means : if we have signal on [0..T) and we add in interval [T..2T) mirror of this signal (I test, it is the same results if we mirror at zero, interval[-T..T) for -T the same as X0), next we apply _Fourier_transform_ we obtain:
1) spectrum with zero phases (zero imaginary part on FFT), only amplitude elements.
2) if cosine function starting at zero, ending at T - cosine transform get us only one element, rank two times highest than Fourier. Moreover, if we have only cosines different ranks starting at 0, ending at T (no phases), we obtain in cosine transform this maplitudes in even spectrum elements, zero in odd. If no phases, cosine transform not litters on other frequencies. If we have discrete transform, fo example 8 ssmples : x0..x7, x8 must be equal x0, and x9..x15 equal x7..x1. If I compute with plain Fourier this double samples I get 1) and 2)
But formulas DCR I..DCT IV satisfy only 1), not 2)
I derive formula:

X_k=\delta(k)x_0 + \sum_{n=1}^{N-1}x_n cos( \frac{\pi n k}{N})

where \delta(k) is equal 1 for even k, 0 for odd k To normalize with input amplitudes X_k should be divided by N for k=0 and by N/2 for k>0. Borneq (talk) 20:56, 15 November 2013 (UTC)

See WP:OR: Wikipedia is not the right forum to promote changes in the DCT definitions. — Steven G. Johnson (talk) 21:23, 15 November 2013 (UTC)
Really, definition x_0 = x_N is not a discrete cosine transform definition. Borneq (talk) 20:45, 16 November 2013 (UTC)

DCT V-VIII[edit]

"since the corresponding DFT is of length 2(N−1) (for DCT-I) or 4N (for DCT-II/III) or 8N (for DCT-VIII)." I think it should refer to DCT-IV instead of DCT-VIII, from the context. Anyway, I could not verify these lengths. --GuenterRote (talk) 10:08, 29 January 2014 (UTC)

You're right, it should have been DCT-IV.
The equivalencies can be found in many sources. The equivalence for the DCT-I is commented on e.g. Numerical Recipes or in P. Duhamel, “Implementation of ‘split-radix’ FFT algorithms for complex, real, and real-symmetric data,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, no. 2, pp. 285–295, 1986. The equivalence for the DCT II was commented on in e.g. M. Vetterli and H. J. Nussbaumer, “Simple FFT and DCT algorithms with reduced number of operations,” Signal Processing, vol. 6, no. 4, pp. 267–278, 1984; see also refs 1 and 6-10 of this paper (also on arxiv). The equivalence for the DCT-IV is less commonly described, but can be found e.g. in this paper (also on arxiv) and refs 1-2 therein. — Steven G. Johnson (talk) 16:06, 30 January 2014 (UTC)

DCT-III is NOT an inverse of DCT-II[edit]

When I perform a DCT-II (aka "the DCT") and then perform its supposed inverse DCT-III (aka "the IDCT") I do NOT get back my original input. It's a HUGE output. If my input waveform (that is in the range of +/- 100) I then transform with a DCT-II, and then immediately transform it back with a DCT-III, I should get the exact waveform back (within rounding errors of the computer), but I don't instead I get a new waveform that is so big it goes out of the range of my waveform display (easily in 1000s or even 1000000s range)! And I made sure my implementation of DCT-II and DCT-III used the EXACT algorithms as shown on the DCT Wikipedia article here. I think something is wrong with your formulas here. Benhut1 (talk) 01:58, 17 March 2014 (UTC)


By the way, this is my implementation in VB6, which I believe is an exact implementation of the DCT-II and DCT-III as shown in this Wikipedia article:

DCT-II code:

   For k = 0 To DCTsize - 1
DCT(k) = 0
For n = 0 To DCTsize - 1
DCT(k) = DCT(k) + WaveForm(n) * Cos(Pi * k / DCTsize * (n + 0.5))
Next n
Next k



DCT-III code:

   For k = 0 To DCTsize - 1
IDCT(k) = DCT(0) / 2
For n = 1 To DCTsize - 1
IDCT(k) = IDCT(k) + DCT(n) * Cos(Pi * n / DCTsize * (k + 0.5))
Next n
Next k

Benhut1 (talk) 04:55, 17 March 2014 (UTC)


Ok I just figured out my problem. I had needed to multiply the output of the DCT-III by 2/DCTsize in order to make it the true inverse of the DCT-II.

Below is my new code for implementing a true IDCT:

   For k = 0 To DCTsize - 1
IDCT(k) = DCT(0) / 2
For n = 1 To DCTsize - 1
IDCT(k) = IDCT(k) + DCT(n) * Cos(Pi * n / DCTsize * (k + 0.5))
Next n
IDCT(k) = IDCT(k) * 2 / DCTsize
Next k

Benhut1 (talk) 05:12, 17 March 2014 (UTC)

Bug in the pseudocode?[edit]

In the line

   IDCT(k) = DCT(0) / 2

Array DCT seems to be not defined inside the function, perhaps the line should be

    IDCT(k) = FreqDomainData(0)  — Preceding unsigned comment added by 84.249.170.164 (talk) 07:48, 17 April 2014 (UTC) 


Ok. I fixed that code now. By the way, your suggested fix was wrong. The proper fix is
    IDCT(k) = FreqDomainData(0) / 2
Benhut1 (talk) 04:22, 21 April 2014 (UTC)