This page explains how the *inverse discrete cosine transform*(IDCT) is
implemented. It follows the proposals of Arai, Agiu and Nakajima^{*} as well as
of Tseng and Miller^{**} completed by the extensions made by
Feig^{***}.

^{*}Y.Arai,T.Agui,M.Nakajima, *A Fast DCT-SQ Scheme for Images*,
Trans. of the IEICE.E 71(11):1095(Nov.1988)

^{**}B.D.Tseng,W.C.Miller, *On Computing the Discrete Cosine Transform*,
IEEE Trans. on Computers, C-27(10):966-8(Oct. 1978)

^{***}E.Feig,E.Linzer, *Discrete Cosine Transform Algorithms for
Image Data Compression*, Proceedings Electronic Imaging '90 East, pp.84-7, Boston,
MA(Oct.29 - Nov.1,1990)

*also published in:*W.B.Pennebaker,J.L.Mitchell,*JPEG Still Image Compression Standard*,
Van Nostrand Reinhold, 1993

The starting point is the formula for the 1 dimensional DCT(1D-DCT):

The suffix `8`

means that the DCT was built by 8 coefficients.

First the 1D-DCT is applied to all rows of the DCT matrix:

Then the 1D-DCT is applied to the columns of the the result of this operation:

By substitution we get the formula of the 2D-DCT:

which proofs: The 2D-DCT can be regarded as a double application of the 1D-DCT. Therefore we first regard the 1D-DCT.

By definition of:

we can rewrite the formula:

For some reason we compute :

Therefore the following is valid:

If we constitute a sequence of elements `f(k), k = 0...15`

with:

we can rewrite the formula:

Because the 16 point *discrete Fourier transform* is defined by

the following relation is valid:

That is the basic idea: Instead of performing an inverse DCT an inverse DFT is performed.
To obtain the DFT values we first multiply the DCT values with .
It seems as this were very inefficient. But bear in mind that the last operation before
the IDCT is the **quantization**. That means every value is to be multiplied
with a certain constant depending on it's position in DCT-matrix. To obtain the DFT values
we simply multiply the quantization constants with .
As a result the quantization and DCT-DFT transformation is performed in one step. The IDFT
actually computes 16 values. But we can simply ignore the higher 8 values *(They are
equal to the first 8 values but in reverse order)*.

To obtain the IDFT equations we first bring the DFT equations in matrix form. If we define a line of the original values by:

a line of the scaled transform values

and a function:
we can establish a Matrix `T`

:_{M}

which transforms the values according to equation `(5)`

:

Because:

the `cos`

computes only `0, 1, -1`

an the positive and
negative values of the following 3 constants:

Therefore the matrix can be rewritten:

To get the matrix equation for IDFT the matrix `T`

must
be inverted:_{M}

The inversion gives:

with:

The matrix `L`

can be factored:

with:

So the IDFT can be performed by 5 steps:

In part `M`

a simplification can be applied:

If you count the operations you'll get 29 additions and 5 multiplications.

The ordinary way is to exploit the horizontal-vertical properties of the DCT (equations `(1)`

-`(4)`

):

First the matrix `L`

is applied to all rows. Then the result is rotated by 90° to
the right and the `L`

is applied once again. Because the 1D-IDFT is applied 16 times
this results in `29 * 16 = 464`

additions and `16 * 5 = 80`

multiplications. The known `libjpeg.a`

uses the horizontal-vertical properties of the DCT wereby some
special cases are regarded.

But Feig^{***} describes an 2D-algorithm with less operations.
The original matrix is re-packed into a one-line-matrix `FF`

by
concatenating all lines:

Then a matrix `KK`

must be found which transforms `FF`

in a
one-line-matrix `XX`

containing all retransformed lines in one line:

The Matrix `KK`

is the *tensor* or *Kronecker* product of
`L`

:

To explain what it means regard the matrices `V`

and _{1}`V`

:_{2}

The *tensor* or *Kronecker* product is:

By means of the following substitution:

the matrix `KK`

can be represented as:

Because of some distributive properties of the tensor product this is the same as:

Now we can decide how to compute the 3 tensor products: according to equation `(7)`

or
by exploiting the horizontal-vertical-properties of the tensor product (equations `(1)-(4)`

).

According to Feig^{***} the current application uses the horizontal-vertical-property for the 1st and last tensor
product. This requires `16 * 26 = 416`

additions.
**Note** that the rotation by 90° isn't a real operation! It is simply a permutation of the indexes.

The middle tensor product is computed according to equation `(7)`

:

**Note** that every `0`

represents a 8x8 zero matrix!

The lines 1,2,4 and 8 are multiplications with the original Matrix `M`

.
We already know this can be performed by 3 additions and 5 multiplications per line:

The lines 3 and 6 are treated the same way extended by a multiplication with `C4`

:

This requires 7 multiplications, 3 additions and 2 shifts to the left.

To transform the values of lines 5 and 7, we build a matrix `J`

only
containing the lines 5 and 7 without the elements 5 and 7:

a submatrix `M'`

of `M`

by discarding both, lines 5 and
7 as well as columns 5 and 7:

The matrix `N`

contains only the values which are both, in lines **and** columns
5 and 7 of matrix `M`

:

The resulting matrix `O`

:

can be computed by:

This gives equations like:

They can be computed similarly to equations`(6)`

by 3 additions
and 3 multiplications.

To compute the 5th and the 7the value of lines 5 and 7 the values
are to be multiplied with the tensor product of `N`

:

If we compute the matrix we get:

But because of:

this results in:

which can be factored:

Using the relation: we get:

This implies 2 multiplications 10 additions and 2 shifts to the left.

If you count the operations for the tensor product of `M`

you'll get:

lines | additions per line | multiplications per line | shifts to the left per line | additions total | multiplications total | shifts to the left total |
---|---|---|---|---|---|---|

1,2,4,8 | 3 | 5 | 0 | 12 | 20 | 0 |

3,6 | 3 | 7 | 2 | 6 | 14 | 4 |

5 and 7 | 6*3+10 | 6*3+2 | 2 | 28 | 20 | 2 |

total | 46 | 54 | 6 |

If we add the 416 additions for the first and the last tensor product we get: 462 additions, 54 multiplications and 6 shifts to the left.

See also An experimental IDCT study by René Stöckel (including a patch to mpeg_play-3.3, which changes the IDCT to IDFT)

Back to

**Java**

-inline-MPEG-Player