Discrete wavelet transform

An example of the 2D discrete wavelet transform that is used in JPEG2000. The original image is high-pass filtered, yielding the three large images, each describing local changes in brightness (details) in the original image. It is then low-pass filtered and downscaled, yielding an approximation image; this image is high-pass filtered to produce the three smaller detail images, and low-pass filtered to produce the final approximation image in the upper-left.

In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information (location in time).

Examples

Haar wavelets

The first DWT was invented by the Hungarian mathematician Alfréd Haar. For an input represented by a list of $2^n$ numbers, the Haar wavelet transform may be considered to pair up input values, storing the difference and passing the sum. This process is repeated recursively, pairing up the sums to provide the next scale, finally resulting in $2^n-1$ differences and one final sum.
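One level of this pairing can be sketched in Java (a minimal illustrative helper with hypothetical names; a complete recursive version appears in the code example at the end of this article):

```java
public class HaarLevel {
    // One level of the Haar pairing step: sums go in the first half of the
    // output, differences in the second half. Assumes even-length input.
    static int[] haarLevel(int[] input) {
        int half = input.length / 2;
        int[] out = new int[input.length];
        for (int i = 0; i < half; i++) {
            out[i] = input[2 * i] + input[2 * i + 1];        // sum, passed to the next scale
            out[half + i] = input[2 * i] - input[2 * i + 1]; // difference, stored
        }
        return out;
    }

    public static void main(String[] args) {
        // (4, 6, 10, 12) -> sums (10, 22), differences (-2, -2)
        System.out.println(java.util.Arrays.toString(haarLevel(new int[]{4, 6, 10, 12})));
        // prints [10, 22, -2, -2]
    }
}
```

Applying `haarLevel` again to the sums (10, 22) yields the next scale, and so on until one final sum remains.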

Daubechies wavelets

The most commonly used set of discrete wavelet transforms was formulated by the Belgian mathematician Ingrid Daubechies in 1988. This formulation is based on the use of recurrence relations to generate progressively finer discrete samplings of an implicit mother wavelet function; each resolution is twice that of the previous scale. In her seminal paper, Daubechies derives a family of wavelets, the first of which is the Haar wavelet. Interest in this field has exploded since then, and many variations of Daubechies' original wavelets were developed.[1]

The Dual-Tree Complex Wavelet Transform (ℂWT)

The dual-tree complex wavelet transform (ℂWT) is a relatively recent enhancement of the discrete wavelet transform (DWT), with important additional properties: it is nearly shift-invariant and directionally selective in two and higher dimensions. It achieves this with a redundancy factor of only $2^d$ for $d$-dimensional signals, which is substantially lower than that of the undecimated DWT. The multidimensional (M-D) dual-tree ℂWT is nonseparable but is based on a computationally efficient, separable filter bank (FB).[2]

Others

Other forms of discrete wavelet transform include the undecimated (or non-decimated) wavelet transform, where downsampling is omitted, and the Newland transform, where an orthonormal basis of wavelets is formed from appropriately constructed top-hat filters in frequency space. Wavelet packet transforms are also related to the discrete wavelet transform, as is the complex wavelet transform.

Properties

The Haar DWT illustrates the desirable properties of wavelets in general. First, it can be performed in $O(n)$ operations; second, it captures not only a notion of the frequency content of the input, by examining it at different scales, but also temporal content, i.e. the times at which these frequencies occur. Combined, these two properties make the fast wavelet transform (FWT) an alternative to the conventional fast Fourier transform (FFT).

Time Issues

Due to the rate-change operators in the filter bank, the discrete WT is not time-invariant but actually very sensitive to the alignment of the signal in time. To address the time-varying problem of wavelet transforms, Mallat and Zhong proposed a new algorithm for wavelet representation of a signal, which is invariant to time shifts.[3] According to this algorithm, which is called a TI-DWT, only the scale parameter is sampled along the dyadic sequence 2^j (j∈Z) and the wavelet transform is calculated for each point in time.[4][5]

Applications

The discrete wavelet transform has a huge number of applications in science, engineering, mathematics and computer science. Most notably, it is used for signal coding, to represent a discrete signal in a more redundant form, often as a preconditioning for data compression. Practical applications can also be found in signal processing of accelerations for gait analysis,[6] in digital communications and many others.[7][8][9]

It has been shown that the discrete wavelet transform (discrete in scale and shift, and continuous in time) can be successfully implemented as an analog filter bank in biomedical signal processing for the design of low-power pacemakers, and also in ultra-wideband (UWB) wireless communications.[10]

Comparison with Fourier transform

To illustrate the differences and similarities between the discrete wavelet transform and the discrete Fourier transform, consider the DWT and DFT of the following sequence: (1,0,0,0), a unit impulse.

The DFT has orthogonal basis (DFT matrix):

$\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & -i & -1 & i\\ 1 & -1 & 1 & -1\\ 1 & i & -1 & -i \end{bmatrix}$

while the DWT with Haar wavelets for length 4 data has orthogonal basis in the rows of:

$\begin{bmatrix} 1 & 1 & 1 & 1\\ 1 & 1 & -1 & -1\\ 1 & -1 & 0 & 0\\ 0 & 0 & 1 & -1 \end{bmatrix}$

(To simplify notation, whole numbers are used, so the bases are orthogonal but not orthonormal.)

Preliminary observations include:

• Wavelets have location – the (1,1,–1,–1) wavelet corresponds to “left side” versus “right side”, while the last two wavelets have support on the left side or the right side, and one is a translation of the other.
• Sinusoidal waves do not have location – they spread across the whole space – but do have phase – the second and third waves are translations of each other, corresponding to being 90° out of phase, like cosine and sine, of which these are discrete versions.

Decomposing the sequence with respect to these bases yields:

\begin{align} (1,0,0,0) &= \frac{1}{4}(1,1,1,1) + \frac{1}{4}(1,1,-1,-1) + \frac{1}{2}(1,-1,0,0) \qquad \text{Haar DWT}\\ (1,0,0,0) &= \frac{1}{4}(1,1,1,1) + \frac{1}{2}(1,0,-1,0) + \frac{1}{4}(1,-1,1,-1) \qquad \text{DFT} \end{align}

The DWT demonstrates the localization: the (1,1,1,1) term gives the average signal value, the (1,1,–1,–1) places the signal in the left side of the domain, and the (1,–1,0,0) places it at the left side of the left side, and truncating at any stage yields a downsampled version of the signal:

\begin{align} &\left(\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}\right)\\ &\left(\frac{1}{2},\frac{1}{2},0,0\right)\qquad\text{2-term truncation}\\ &\left(1,0,0,0\right) \end{align}
The sinc function, showing the time domain artifacts (undershoot and ringing) of truncating a Fourier series.

The DFT, by contrast, expresses the sequence by the interference of waves of various frequencies – thus truncating the series yields a low-pass filtered version of the series:

\begin{align} &\left(\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4}\right)\\ &\left(\frac{3}{4},\frac{1}{4},-\frac{1}{4},\frac{1}{4}\right)\qquad\text{2-term truncation}\\ &\left(1,0,0,0\right) \end{align}

Notably, the middle approximation (2-term) differs. From the frequency domain perspective, this is a better approximation, but from the time domain perspective it has drawbacks – it exhibits undershoot – one of the values is negative, though the original series is non-negative everywhere – and ringing, where the right side is non-zero, unlike in the wavelet transform. On the other hand, the Fourier approximation correctly shows a peak, and all points are within $1/4$ of their correct value, though all points have error. The wavelet approximation, by contrast, places a peak on the left half, but has no peak at the first point, and while it is exactly correct for half the values (reflecting location), it has an error of $1/2$ for the other values.

This illustrates the kinds of trade-offs between these transforms, and how in some respects the DWT provides preferable behavior, particularly for the modeling of transients.
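The Haar-side decomposition above can be verified numerically. The sketch below (hypothetical helper names) projects the impulse onto the unnormalized Haar basis rows and rebuilds the 2-term truncation:

```java
public class HaarProjection {
    // Unnormalized Haar basis for length-4 signals, rows as given in the text
    static final double[][] BASIS = {
        {1, 1, 1, 1},
        {1, 1, -1, -1},
        {1, -1, 0, 0},
        {0, 0, 1, -1}
    };

    // Projection coefficient <x, b> / <b, b> (basis is orthogonal, not orthonormal)
    static double coeff(double[] x, double[] b) {
        double num = 0, den = 0;
        for (int i = 0; i < x.length; i++) {
            num += x[i] * b[i];
            den += b[i] * b[i];
        }
        return num / den;
    }

    public static void main(String[] args) {
        double[] x = {1, 0, 0, 0};
        double[] trunc2 = new double[4]; // 2-term truncation
        for (int j = 0; j < 2; j++) {
            double c = coeff(x, BASIS[j]); // coefficients come out as 1/4, 1/4
            for (int i = 0; i < 4; i++) trunc2[i] += c * BASIS[j][i];
        }
        System.out.println(java.util.Arrays.toString(trunc2));
        // prints [0.5, 0.5, 0.0, 0.0] -- the downsampled signal from the text
    }
}
```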

Definition

One level of the transform

The DWT of a signal $x$ is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response $g$, resulting in a convolution of the two:

$y[n] = (x * g)[n] = \sum\limits_{k = - \infty }^\infty {x[k] g[n - k]}.$

The signal is also decomposed simultaneously using a high-pass filter $h$. The outputs give the detail coefficients (from the high-pass filter) and approximation coefficients (from the low-pass filter). It is important that the two filters are related to each other; together they are known as a quadrature mirror filter pair.

However, since half the frequencies of the signal have now been removed, half the samples can be discarded according to Nyquist's rule. The filter outputs are then subsampled by 2 (in Mallat's and the common notation, the convention is the opposite: $g$ denotes the high-pass and $h$ the low-pass filter):

$y_{\mathrm{low}} [n] = \sum\limits_{k = - \infty }^\infty {x[k] g[2 n - k]}$
$y_{\mathrm{high}} [n] = \sum\limits_{k = - \infty }^\infty {x[k] h[2 n - k]}$

This decomposition has halved the time resolution since only half of each filter output characterises the signal. However, each output has half the frequency band of the input so the frequency resolution has been doubled.
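For finite-length filters, the two subsampled convolutions can be implemented directly. The sketch below (hypothetical names) follows this article's convention of $g$ as the low-pass and $h$ as the high-pass filter, using the Haar pair given later in this article; boundary handling (how the ends of the signal are treated) varies between implementations.

```java
public class AnalysisStep {
    // (x * f) downsampled by 2: y[n] = sum_k x[k] f[2n - k]
    static double[] convolveDown2(double[] x, double[] f) {
        int fullLen = x.length + f.length - 1;   // length of the full convolution
        double[] y = new double[(fullLen + 1) / 2];
        for (int n = 0; n < y.length; n++) {
            for (int k = 0; k < x.length; k++) {
                int idx = 2 * n - k;
                if (idx >= 0 && idx < f.length) y[n] += x[k] * f[idx];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        double s = Math.sqrt(2) / 2;
        double[] g = {s, s};   // Haar low-pass (this article's convention)
        double[] h = {-s, s};  // Haar high-pass
        double[] x = {1, 2, 3, 4};
        // Approximation and detail coefficients (with boundary terms at the ends)
        System.out.println(java.util.Arrays.toString(convolveDown2(x, g)));
        System.out.println(java.util.Arrays.toString(convolveDown2(x, h)));
    }
}
```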

Block diagram of filter analysis

With the subsampling operator $\downarrow$

$(y \downarrow k)[n] = y[k n]$

the above summation can be written more concisely.

$y_{\mathrm{low}} = (x*g)\downarrow 2$
$y_{\mathrm{high}} = (x*h)\downarrow 2$

However, computing a complete convolution $x*g$ with subsequent downsampling would waste computation time.

The Lifting scheme is an optimization where these two computations are interleaved.
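For the Haar case, the interleaving can be sketched as a predict step (a difference) followed by an update step (a rounded average); this integer-to-integer variant is sometimes called the S-transform. A minimal sketch, with hypothetical names:

```java
public class HaarLifting {
    // One level of the Haar transform via lifting:
    // predict computes the detail, update computes a rounded average.
    // Assumes even-length input. Note: d >> 1 is floor division by 2,
    // which the inverse undoes exactly, giving perfect reconstruction.
    static int[][] forward(int[] x) {
        int half = x.length / 2;
        int[] s = new int[half], d = new int[half];
        for (int i = 0; i < half; i++) {
            d[i] = x[2 * i + 1] - x[2 * i]; // predict: detail coefficient
            s[i] = x[2 * i] + (d[i] >> 1);  // update: rounded average
        }
        return new int[][]{s, d};
    }

    static int[] inverse(int[] s, int[] d) {
        int[] x = new int[2 * s.length];
        for (int i = 0; i < s.length; i++) {
            x[2 * i] = s[i] - (d[i] >> 1);  // undo update
            x[2 * i + 1] = d[i] + x[2 * i]; // undo predict
        }
        return x;
    }

    public static void main(String[] args) {
        int[][] sd = forward(new int[]{1, 2, 3, 4});
        System.out.println(java.util.Arrays.toString(sd[0]) + " "
                + java.util.Arrays.toString(sd[1]));
        // prints [1, 3] [1, 1]
    }
}
```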

This decomposition is repeated to further increase the frequency resolution and the approximation coefficients decomposed with high and low pass filters and then down-sampled. This is represented as a binary tree with nodes representing a sub-space with a different time-frequency localisation. The tree is known as a filter bank.
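The repeated decomposition can be sketched as a loop that keeps each level's detail coefficients and recurses only on the approximation (a sketch with hypothetical names, using Haar filters for concreteness and assuming a power-of-two input length):

```java
import java.util.ArrayList;
import java.util.List;

public class FilterBank {
    // Decompose a signal through the filter bank: at each level, store the
    // detail coefficients and continue with the approximation coefficients.
    // Sign conventions for the high-pass output vary between authors.
    static List<double[]> decompose(double[] x, int levels) {
        List<double[]> subbands = new ArrayList<>();
        double s = Math.sqrt(2) / 2;
        for (int lvl = 0; lvl < levels; lvl++) {
            int half = x.length / 2;
            double[] approx = new double[half], detail = new double[half];
            for (int i = 0; i < half; i++) {
                approx[i] = s * (x[2 * i] + x[2 * i + 1]); // low-pass, downsampled
                detail[i] = s * (x[2 * i] - x[2 * i + 1]); // high-pass, downsampled
            }
            subbands.add(detail);
            x = approx; // only the low-frequency branch is split further
        }
        subbands.add(x); // final approximation
        return subbands;
    }

    public static void main(String[] args) {
        for (double[] band : decompose(new double[32], 3))
            System.out.print(band.length + " ");
        // prints sub-band sizes 16 8 4 4, matching the 32-sample example below
    }
}
```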

A 3 level filter bank

At each level in the above diagram the signal is decomposed into low and high frequencies. Due to the decomposition process the length of the input signal must be a multiple of $2^n$, where $n$ is the number of levels.

For example, for a signal with 32 samples, a frequency range of 0 to $f_n$, and 3 levels of decomposition, 4 output scales are produced:

Level   Frequencies             Samples
3       $0$ to $f_n/8$          4
3       $f_n/8$ to $f_n/4$      4
2       $f_n/4$ to $f_n/2$      8
1       $f_n/2$ to $f_n$        16
Frequency domain representation of the DWT

Relationship to the Mother Wavelet

The filterbank implementation of wavelets can be interpreted as computing the wavelet coefficients of a discrete set of child wavelets for a given mother wavelet $\psi(t)$. In the case of the discrete wavelet transform, the mother wavelet is shifted and scaled by powers of two

$\psi_{j,k}(t)= \frac{1}{\sqrt{2^j}} \psi \left( \frac{t - k 2^j}{2^j} \right)$

where $j$ is the scale parameter and $k$ is the shift parameter, both of which are integers.

Recall that the wavelet coefficient $\gamma$ of a signal $x(t)$ is the projection of $x(t)$ onto a wavelet, and let $x(t)$ be a signal of length $2^N$. In the case of a child wavelet in the discrete family above,

$\gamma_{jk} = \int_{-\infty}^{\infty} x(t) \frac{1}{\sqrt{2^j}} \psi \left( \frac{t - k 2^j}{2^j} \right) dt$

Now fix $j$ at a particular scale, so that $\gamma_{jk}$ is a function of $k$ only. In light of the above equation, $\gamma_{jk}$ can be viewed as a convolution of $x(t)$ with a dilated, reflected, and normalized version of the mother wavelet, $h(t) = \frac{1}{\sqrt{2^j}} \psi \left( \frac{-t}{2^j} \right)$, sampled at the points $2^j, 2\cdot 2^j, 3\cdot 2^j, \ldots, 2^N$. But this is precisely what the detail coefficients give at level $j$ of the discrete wavelet transform. Therefore, for an appropriate choice of $h[n]$ and $g[n]$, the detail coefficients of the filter bank correspond exactly to the wavelet coefficients of a discrete set of child wavelets for a given mother wavelet $\psi(t)$.

As an example, consider the discrete Haar wavelet, whose mother wavelet is $\psi = [1, -1]$. Then the dilated, reflected, and normalized version of this wavelet is $h[n] = \frac{1}{\sqrt{2}} [-1, 1]$, which is, indeed, the high-pass decomposition filter for the discrete Haar wavelet transform.

Time Complexity

The filterbank implementation of the discrete wavelet transform takes only O(N) time in certain cases, as compared to O(N log N) for the fast Fourier transform.

Note that if $g[n]$ and $h[n]$ are both a constant length (i.e. their length is independent of N), then $x * h$ and $x * g$ each take O(N) time. The wavelet filterbank does each of these two O(N) convolutions, then splits the signal into two branches of size N/2. But it only recursively splits the upper branch convolved with $g[n]$ (as contrasted with the FFT, which recursively splits both the upper branch and the lower branch). This leads to the following recurrence relation

$T(N) = 2N + T(N/2)$

which leads to an O(N) time for the entire operation, as can be shown by a geometric series expansion of the above relation.
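Concretely, expanding the recurrence term by term gives a geometric series:

```latex
T(N) = 2N + T(N/2)
     = 2N + N + \frac{N}{2} + \cdots
     = 2N\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots\right) < 4N,
```

hence $T(N) = O(N)$.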

As an example, the Discrete Haar Wavelet Transform is linear, since in that case $h[n]$ and $g[n]$ are constant length 2.

$h[n] = \left[\frac{-\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right]$
$g[n] = \left[\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right]$

Other transforms

The Adam7 algorithm, used for interlacing in the Portable Network Graphics (PNG) format, is a multiscale model of the data which is similar to a DWT with Haar wavelets.

Unlike the DWT, it has a specific scale – it starts from an 8×8 block, and it downsamples the image, rather than decimating (low-pass filtering, then downsampling). It thus offers worse frequency behavior, showing artifacts (pixelation) at the early stages, in return for simpler implementation.

Code example

In its simplest form, the DWT is remarkably easy to compute.

The Haar wavelet in Java:

public static int[] discreteHaarWaveletTransform(int[] input) {
    // This function assumes that input.length = 2^n, n > 1.
    // Note: input is used as scratch space and is overwritten.
    int[] output = new int[input.length];

    for (int length = input.length >> 1; ; length >>= 1) {
        // length = input.length / 2^n, with n increasing up to log2(input.length)
        for (int i = 0; i < length; ++i) {
            int sum = input[i * 2] + input[i * 2 + 1];
            int difference = input[i * 2] - input[i * 2 + 1];
            output[i] = sum;
            output[length + i] = difference;
        }
        if (length == 1) {
            return output;
        }

        // Copy the sums back into input for the next iteration;
        // the detail coefficients already written to output are preserved.
        System.arraycopy(output, 0, input, 0, length << 1);
    }
}
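As a usage example, applying the transform to a short input (the function is repeated inside a class here so the snippet is self-contained; `Arrays.copyOf` is used instead of overwriting the caller's array):

```java
public class HaarDemo {
    // Same algorithm as the function above, repeated for self-containment.
    public static int[] discreteHaarWaveletTransform(int[] input) {
        int[] output = new int[input.length];
        for (int length = input.length >> 1; ; length >>= 1) {
            for (int i = 0; i < length; ++i) {
                output[i] = input[i * 2] + input[i * 2 + 1];          // sum
                output[length + i] = input[i * 2] - input[i * 2 + 1]; // difference
            }
            if (length == 1) {
                return output;
            }
            // Carry the sums forward without mutating the caller's array
            input = java.util.Arrays.copyOf(output, length << 1);
        }
    }

    public static void main(String[] args) {
        // Level 1 of (1, 2, 3, 4): sums (3, 7), differences (-1, -1);
        // level 2: sum 10, difference -4.
        System.out.println(java.util.Arrays.toString(
                discreteHaarWaveletTransform(new int[]{1, 2, 3, 4})));
        // prints [10, -4, -1, -1]
    }
}
```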


Complete Java code for a 1-D and 2-D DWT using Haar, Daubechies, Coiflet, and Legendre wavelets is available from the open source project JWave. Furthermore, a fast lifting implementation of the discrete biorthogonal CDF 9/7 wavelet transform in C, used in the JPEG 2000 image compression standard, can be found here (archived 5th March 2012).

Example of Above Code

An example of computing the discrete Haar wavelet coefficients for a sound signal of someone saying "I Love Wavelets." The original waveform is shown in blue in the upper left, and the wavelet coefficients are shown in black in the upper right. Along the bottom is shown three zoomed-in regions of the wavelet coefficients for different ranges.

This figure shows an example of applying the above code to compute the Haar wavelet coefficients on a sound waveform. This example highlights two key properties of the wavelet transform:

• Natural signals often have some degree of smoothness, which makes them sparse in the wavelet domain. There are far fewer significant components in the wavelet domain in this example than there are in the time domain, and most of the significant components are towards the coarser coefficients on the left. Hence, natural signals are compressible in the wavelet domain.
• The wavelet transform is a multiresolution, bandpass representation of a signal. This can be seen directly from the filterbank definition of the discrete wavelet transform given in this article. For a signal of length $2^N$, the coefficients in the range $[2^{N-j}, 2^{N-j+1}-1]$ represent a version of the original signal which is in the pass-band $\left[ \frac{\pi}{2^j}, \frac{\pi}{2^{j-1}} \right]$. This is why zooming in on these ranges of the wavelet coefficients looks so similar in structure to the original signal. Ranges which are closer to the left (larger $j$ in the above notation) are coarser representations of the signal, while ranges to the right represent finer details.

References

1. ^ Akansu, Ali N.; Haddad, Richard A. (1992), Multiresolution signal decomposition: transforms, subbands, and wavelets, Boston, MA: Academic Press, ISBN 978-0-12-047141-6
2. ^ Selesnick, I.W.; Baraniuk, R.G.; Kingsbury, N.C. (2005), "The dual-tree complex wavelet transform", IEEE Signal Processing Magazine, vol. 22, no. 6.
3. ^ S. Mallat, A Wavelet Tour of Signal Processing, 2nd ed. San Diego, CA: Academic, 1999.
4. ^ S. G. Mallat and S. Zhong, “Characterization of signals from multiscale edges,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 7, pp. 710– 732, Jul. 1992.
5. ^ Ince, T.; Kiranyaz, S.; Gabbouj, M. (2009), "A generic and robust system for automated patient-specific classification of ECG signals".
6. ^ "Novel method for stride length estimation with body area network accelerometers", IEEE BioWireless 2011, pp. 79-82
7. ^ A.N. Akansu and M.J.T. Smith, Subband and Wavelet Transforms: Design and Applications, Kluwer Academic Publishers, 1995.
8. ^ A.N. Akansu and M.J. Medley, Wavelet, Subband and Block Transforms in Communications and Multimedia, Kluwer Academic Publishers, 1999.
9. ^ A.N. Akansu, P. Duhamel, X. Lin and M. de Courville Orthogonal Transmultiplexers in Communication: A Review, IEEE Trans. On Signal Processing, Special Issue on Theory and Applications of Filter Banks and Wavelets. Vol. 46, No.4, pp. 979-995, April, 1998.
10. ^ A.N. Akansu, W.A. Serdijn, and I.W. Selesnick, Wavelet Transforms in Signal Processing: A Review of Emerging Applications, Physical Communication, Elsevier, vol. 3, issue 1, pp. 1-18, March 2010.