Discrete Hartley transform

From Wikipedia, the free encyclopedia

A discrete Hartley transform (DHT) is a Fourier-related transform of discrete, periodic data similar to the discrete Fourier transform (DFT), with analogous applications in signal processing and related fields. Its main distinction from the DFT is that it transforms real inputs to real outputs, with no intrinsic involvement of complex numbers. Just as the DFT is the discrete analogue of the continuous Fourier transform, the DHT is the discrete analogue of the continuous Hartley transform, introduced by R. V. L. Hartley in 1942.

Because there are fast algorithms for the DHT analogous to the fast Fourier transform (FFT), the DHT was originally proposed by R. N. Bracewell in 1983 as a more efficient computational tool in the common case where the data are purely real. It was subsequently argued, however, that specialized FFT algorithms for real inputs or outputs can ordinarily be found with slightly fewer operations than any corresponding algorithm for the DHT (see below).


Formally, the discrete Hartley transform is a linear, invertible function H : R^N -> R^N (where R denotes the set of real numbers). The N real numbers x_0, ..., x_{N-1} are transformed into the N real numbers H_0, ..., H_{N-1} according to the formula

H_k = \sum_{n=0}^{N-1} x_n \left[ \cos \left( \frac{2 \pi}{N} n k \right) + \sin \left( \frac{2 \pi}{N} n k \right) \right]
\quad \quad
 k = 0, \dots, N-1 .

The combination \cos(z) + \sin(z) \! = \sqrt{2} \cos(z-\frac{\pi}{4}) is sometimes denoted \mathrm{cas}(z) \!, and should be contrasted with the e^{-iz} = \cos(z) - i \sin(z) \! that appears in the DFT definition (where i is the imaginary unit).

As with the DFT, the overall scale factor in front of the transform and the sign of the sine term are a matter of convention. Although these conventions occasionally vary between authors, they do not affect the essential properties of the transform.
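
The definition above can be evaluated directly. The following is a minimal sketch in Python, using the unnormalized convention with a positive sine term given above; the function names `cas` and `dht` are illustrative choices, not part of any standard library.

```python
import math

def cas(z):
    """Hartley kernel: cas(z) = cos(z) + sin(z)."""
    return math.cos(z) + math.sin(z)

def dht(x):
    """Direct O(N^2) evaluation of H_k = sum_n x_n * cas(2*pi*n*k/N)."""
    N = len(x)
    return [sum(x[n] * cas(2 * math.pi * n * k / N) for n in range(N))
            for k in range(N)]

# Real input, real output: no complex arithmetic is involved.
H = dht([1.0, 2.0, 3.0, 4.0])
```

For this input, H is (10, -4, -2, 0) up to floating-point rounding.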


The transform can be interpreted as the multiplication of the vector (x_0, ..., x_{N-1}) by an N-by-N matrix; therefore, the discrete Hartley transform is a linear operator. The matrix is invertible; the inverse transformation, which allows one to recover the x_n from the H_k, is simply the DHT of H_k multiplied by 1/N. That is, the DHT is its own inverse (involutory), up to an overall scale factor.
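
The involutory property can be checked numerically. The sketch below assumes the unnormalized forward transform defined earlier; `dht` and `idht` are illustrative names.

```python
import math

def dht(x):
    """Direct O(N^2) discrete Hartley transform (unnormalized)."""
    N = len(x)
    return [sum(x[n] * (math.cos(2 * math.pi * n * k / N)
                        + math.sin(2 * math.pi * n * k / N))
                for n in range(N))
            for k in range(N)]

def idht(H):
    """Inverse DHT: the same forward transform, followed by division by N."""
    N = len(H)
    return [v / N for v in dht(H)]

x = [0.5, -1.0, 2.0, 3.5, 0.0]
roundtrip = idht(dht(x))
# Largest deviation between the original data and the round trip.
max_error = max(abs(a - b) for a, b in zip(x, roundtrip))
```

Here `max_error` should be at the level of floating-point rounding.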

The DHT can be used to compute the DFT, and vice versa. For real inputs x_n, the DFT output X_k has a real part (H_k + H_{N-k})/2 and an imaginary part (H_{N-k} - H_k)/2. Conversely, the DHT is equivalent to computing the DFT of x_n multiplied by 1+i, then taking the real part of the result.
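
Both directions of this relation can be sketched as follows. The helper names are illustrative, the DFT uses the e^{-iz} sign convention mentioned earlier, and indices are taken modulo N so that H_N = H_0.

```python
import cmath
import math

def dht(x):
    """Direct O(N^2) discrete Hartley transform."""
    N = len(x)
    return [sum(x[n] * (math.cos(2 * math.pi * n * k / N)
                        + math.sin(2 * math.pi * n * k / N))
                for n in range(N))
            for k in range(N)]

def dft(x):
    """Direct O(N^2) discrete Fourier transform, kernel exp(-2*pi*i*n*k/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
            for k in range(N)]

def dft_from_dht(H):
    """Re X_k = (H_k + H_{N-k})/2,  Im X_k = (H_{N-k} - H_k)/2."""
    N = len(H)
    return [complex((H[k] + H[-k % N]) / 2, (H[-k % N] - H[k]) / 2)
            for k in range(N)]

def dht_from_dft(X):
    """H_k = Re[(1 + i) X_k], valid for real input data."""
    return [((1 + 1j) * Xk).real for Xk in X]

x = [1.0, -2.0, 0.5, 4.0, 3.0, -1.5]
X_direct, H_direct = dft(x), dht(x)
X_via_dht = dft_from_dht(H_direct)
H_via_dft = dht_from_dft(X_direct)
```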

As with the DFT, a cyclic convolution z = x*y of two vectors x = (x_n) and y = (y_n) to produce a vector z = (z_n), all of length N, becomes a simple operation after the DHT. In particular, suppose that the vectors X, Y, and Z denote the DHT of x, y, and z respectively. Then the elements of Z are given by:

Z_k = \left[ X_k \left( Y_k + Y_{N-k} \right) + X_{N-k} \left( Y_k - Y_{N-k} \right) \right] / 2

Z_{N-k} = \left[ X_{N-k} \left( Y_k + Y_{N-k} \right) - X_k \left( Y_k - Y_{N-k} \right) \right] / 2


where we take all of the vectors to be periodic in N (X_N = X_0, etc.). Thus, just as the DFT transforms a convolution into a pointwise multiplication of complex numbers (pairs of real and imaginary parts), the DHT transforms a convolution into a simple combination of pairs of real frequency components. The inverse DHT then yields the desired vector z. In this way, a fast algorithm for the DHT (see below) yields a fast algorithm for convolution. (Note that this is slightly more expensive than the corresponding procedure for the DFT, not including the costs of the transforms below, because the pairwise operation above requires 8 real-arithmetic operations compared to the 6 of a complex multiplication. This count does not include the division by 2, which can be absorbed, e.g., into the 1/N normalization of the inverse DHT.)
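
The convolution procedure can be sketched as follows. For brevity this illustration uses a direct O(N^2) DHT rather than a fast algorithm, so it shows the structure of the method, not its speed; all function names are illustrative.

```python
import math

def dht(x):
    """Direct O(N^2) discrete Hartley transform."""
    N = len(x)
    return [sum(x[n] * (math.cos(2 * math.pi * n * k / N)
                        + math.sin(2 * math.pi * n * k / N))
                for n in range(N))
            for k in range(N)]

def dht_convolve(x, y):
    """Cyclic convolution of two real sequences of equal length via the DHT."""
    N = len(x)
    X, Y = dht(x), dht(y)
    Z = []
    for k in range(N):
        m = -k % N  # index N-k, taken modulo N so that X_N = X_0
        Z.append((X[k] * (Y[k] + Y[m]) + X[m] * (Y[k] - Y[m])) / 2)
    # Inverse DHT = forward DHT scaled by 1/N.
    return [v / N for v in dht(Z)]

def cyclic_convolve(x, y):
    """Reference: cyclic convolution straight from its definition."""
    N = len(x)
    return [sum(x[n] * y[(k - n) % N] for n in range(N)) for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, -1.0, 0.0, 0.5, 3.0]
z_dht = dht_convolve(x, y)
z_ref = cyclic_convolve(x, y)
```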

Fast algorithms

Just as for the DFT, evaluating the DHT definition directly would require O(N^2) arithmetical operations (see Big O notation). There are fast algorithms similar to the FFT, however, that compute the same result in only O(N log N) operations. Nearly every FFT algorithm, from Cooley-Tukey to Prime-Factor to Winograd (Sorensen et al., 1985) to Bruun's (Bini & Bozzo, 1993), has a direct analogue for the discrete Hartley transform. (However, a few of the more exotic FFT algorithms, such as the QFT, have not yet been investigated in the context of the DHT.)

In particular, the DHT analogue of the Cooley-Tukey algorithm is commonly known as the fast Hartley transform (FHT) algorithm, and was first described by Bracewell in 1984. This FHT algorithm, at least when applied to power-of-two sizes N, is the subject of the United States patent number 4,646,256, issued in 1987 to Stanford University. Stanford placed this patent in the public domain in 1994 (Bracewell, 1995).
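
To illustrate the idea, here is a minimal recursive radix-2 decimation-in-time sketch for power-of-two N. It is not claimed to reproduce Bracewell's formulation; it relies only on the identity cas(a + b) = cos(b) cas(a) + sin(b) cas(-a), which gives H_k = E_k + cos(2πk/N) O_k + sin(2πk/N) O_{-k}, where E and O are the half-length DHTs of the even- and odd-indexed samples, with indices taken modulo N/2. The names `fht` and `dht_direct` are illustrative.

```python
import math

def dht_direct(x):
    """Reference O(N^2) discrete Hartley transform."""
    N = len(x)
    return [sum(x[n] * (math.cos(2 * math.pi * n * k / N)
                        + math.sin(2 * math.pi * n * k / N))
                for n in range(N))
            for k in range(N)]

def fht(x):
    """Recursive radix-2 fast Hartley transform; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [float(x[0])]
    if N % 2:
        raise ValueError("length must be a power of two")
    E = fht(x[0::2])  # DHT of even-indexed samples
    O = fht(x[1::2])  # DHT of odd-indexed samples
    M = N // 2
    out = []
    for k in range(N):
        a = 2 * math.pi * k / N
        # Unlike the FFT butterfly, the odd part needs both O_k and O_{-k}.
        out.append(E[k % M] + math.cos(a) * O[k % M] + math.sin(a) * O[-k % M])
    return out

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
H_fast = fht(x)
H_ref = dht_direct(x)
```

A production implementation would normally be iterative and in-place with precomputed trigonometric tables; the recursion here is chosen only for readability.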

As mentioned above, DHT algorithms are typically slightly less efficient (in terms of the number of floating-point operations) than the corresponding DFT algorithm (FFT) specialized for real inputs (or outputs). This was first argued by Sorensen et al. (1987) and Duhamel & Vetterli (1987). The latter authors obtained what appears to be the lowest published operation count for the DHT of power-of-two sizes, employing a split-radix algorithm (similar to the split-radix FFT) that breaks a DHT of length N into a DHT of length N/2 and two real-input DFTs (not DHTs) of length N/4. In this way, they argued that a DHT of power-of-two length can be computed with, at best, 2 more additions than the corresponding number of arithmetic operations for the real-input DFT.

On present-day computers, performance is determined more by cache and CPU pipeline considerations than by strict operation counts, and a slight difference in arithmetic cost is unlikely to be significant. Since FHT and real-input FFT algorithms have similar computational structures, neither appears to have a substantial a priori speed advantage (Popovic and Sevic, 1994). As a practical matter, highly optimized real-input FFT libraries are available from many sources (e.g. from CPU vendors such as Intel), whereas highly optimized DHT libraries are less common.

On the other hand, the redundant computations in FFTs due to real inputs are more difficult to eliminate for large prime N, despite the existence of O(N log N) complex-data algorithms for such cases, because the redundancies are hidden behind intricate permutations and/or phase rotations in those algorithms. In contrast, a standard prime-size FFT algorithm, Rader's algorithm, can be directly applied to the DHT of real data for roughly a factor of two less computation than that of the equivalent complex FFT (Frigo and Johnson, 2005). However, a non-DHT-based adaptation of Rader's algorithm for real-input DFTs is also possible (Chu & Burrus, 1982).

In addition, as a real and symmetric transform, the MD-DHT (multidimensional DHT) is simpler than the DFT. First, the inverse DHT is identical to the forward transform, up to an overall scale factor. Second, since the kernel is real, it avoids the computational complexity of complex numbers. Additionally, the DFT is directly obtainable from the DHT by a simple additive operation (Bracewell, 1983).

The MD-DHT is widely used in areas like image and optical signal processing. Specific applications include computer vision, high-definition television, and teleconferencing, areas that process or analyze motion images (Zeng, 2000). As computing speed keeps increasing, bigger multidimensional problems become computationally feasible, creating a need for fast multidimensional algorithms. Three such algorithms follow.

The rD-DHT (MD-DHT with "r" dimensions) is given by

X(k_1,k_2,\ldots,k_r)=\sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} \dots \sum_{n_r=0}^{N_r-1} x(n_1,n_2,\ldots,n_r) \, \mathrm{cas}\left(\frac{2\pi n_1 k_1}{N_1}+\dots +\frac{2\pi n_r k_r}{N_r}\right)

with k_i = 0,1,\ldots, N_i-1 .
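
As an illustration, the r = 2 case of this definition can be evaluated directly. This is a minimal sketch with illustrative names (`cas`, `dht2`); the input is a list of rows.

```python
import math

def cas(z):
    """Hartley kernel: cas(z) = cos(z) + sin(z)."""
    return math.cos(z) + math.sin(z)

def dht2(x):
    """Direct 2-D DHT of an N1 x N2 array given as a list of lists."""
    N1, N2 = len(x), len(x[0])
    return [[sum(x[n1][n2] * cas(2 * math.pi * (n1 * k1 / N1 + n2 * k2 / N2))
                 for n1 in range(N1) for n2 in range(N2))
             for k2 in range(N2)]
            for k1 in range(N1)]

X = dht2([[1.0, 2.0],
          [3.0, 4.0]])
```

For this 2-by-2 input, X is [[10, -2], [-4, 0]] up to rounding. Note that the kernel is the cas of the sum of the angles, not a product of one-dimensional cas factors.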

In pursuit of separability, we now consider the following transform (proposed by Bracewell),

\hat{X}(k_1,k_2,\ldots,k_r)=\sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} \dots \sum_{n_r=0}^{N_r-1} x(n_1,n_2,\ldots,n_r) \, \mathrm{cas}\left(\frac{2\pi n_1 k_1}{N_1}\right) \dots \mathrm{cas}\left(\frac{2\pi n_r k_r}{N_r}\right)

Bortfeld (1995) showed that the two can be related by a few additions. For example, in 3-D,

X(k_1,k_2,k_3) = \frac{1}{2} [\hat{X}(k_1,k_2,-k_3)+\hat{X}(k_1,-k_2,k_3)+\hat{X}(-k_1,k_2,k_3)-\hat{X}(-k_1,-k_2,-k_3)].
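
The 3-D relation can be verified numerically on a small array. The sketch below computes both transforms by direct summation, so it is only a check of the identity and not a fast algorithm; negative indices are taken modulo the corresponding dimension, and the function names are illustrative.

```python
import itertools
import math

def cas(z):
    return math.cos(z) + math.sin(z)

def dht3(x, separable=False):
    """3-D DHT by direct summation.

    separable=False uses the kernel cas(a1 + a2 + a3)     (the transform X).
    separable=True  uses the kernel cas(a1)cas(a2)cas(a3) (the transform X-hat).
    """
    N1, N2, N3 = len(x), len(x[0]), len(x[0][0])
    idx = list(itertools.product(range(N1), range(N2), range(N3)))
    out = [[[0.0] * N3 for _ in range(N2)] for _ in range(N1)]
    for k1, k2, k3 in idx:
        total = 0.0
        for n1, n2, n3 in idx:
            a1 = 2 * math.pi * n1 * k1 / N1
            a2 = 2 * math.pi * n2 * k2 / N2
            a3 = 2 * math.pi * n3 * k3 / N3
            kern = cas(a1) * cas(a2) * cas(a3) if separable else cas(a1 + a2 + a3)
            total += x[n1][n2][n3] * kern
        out[k1][k2][k3] = total
    return out

def combine(Xh):
    """Recover X from X-hat using the four-term relation, indices modulo N_i."""
    N1, N2, N3 = len(Xh), len(Xh[0]), len(Xh[0][0])
    X = [[[0.0] * N3 for _ in range(N2)] for _ in range(N1)]
    for k1, k2, k3 in itertools.product(range(N1), range(N2), range(N3)):
        m1, m2, m3 = -k1 % N1, -k2 % N2, -k3 % N3
        X[k1][k2][k3] = 0.5 * (Xh[k1][k2][m3] + Xh[k1][m2][k3]
                               + Xh[m1][k2][k3] - Xh[m1][m2][m3])
    return X

# Arbitrary deterministic 2 x 3 x 4 test data.
x = [[[float((3 * i + 5 * j + 7 * k) % 11) for k in range(4)]
      for j in range(3)] for i in range(2)]
direct = dht3(x)
via_separable = combine(dht3(x, separable=True))
max_error = max(abs(direct[i][j][k] - via_separable[i][j][k])
                for i in range(2) for j in range(3) for k in range(4))
```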

For \hat{X}, row-column algorithms can then be implemented. This technique is commonly used due to the simplicity of such R-C algorithms, but they are not optimized for general M-D spaces.

Other fast algorithms have been developed, such as radix-2, radix-4, and split-radix. For example, Boussakta (2000) developed the 3-D vector radix algorithm for

X(k_1,k_2,k_3)=\sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1}\sum_{n_3=0}^{N-1} x(n_1,n_2,n_3) \, \mathrm{cas}\left(\frac{2\pi}{N}(n_1 k_1+n_2 k_2 +n_3 k_3)\right)




Boussakta (2000) also reported that a 1-butterfly whole transform takes (\frac{7}{4})N^3 \log_2 N multiplications and (\frac{31}{8})N^3 \log_2 N additions, compared to 3N^3 \log_2 N multiplications and (\frac{9}{2})N^3 \log_2 N+3N^2 additions for the row-column approach. The drawback is that this algorithm is only good for a single dimensional size.
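
To get a feel for these counts, the quoted formulas can be tabulated for a given N. The sketch below simply evaluates the expressions above; the function names are illustrative, and no claim is made beyond the formulas as quoted.

```python
import math

def vector_radix_ops(N):
    """(multiplications, additions) quoted for the 3-D vector-radix algorithm."""
    t = N ** 3 * math.log2(N)
    return 7 / 4 * t, 31 / 8 * t

def row_column_ops(N):
    """(multiplications, additions) quoted for the row-column approach."""
    t = N ** 3 * math.log2(N)
    return 3 * t, 9 / 2 * t + 3 * N ** 2

vr_mul, vr_add = vector_radix_ops(8)
rc_mul, rc_add = row_column_ops(8)
mul_ratio = vr_mul / rc_mul  # 7/12 for every N, since both scale as N^3 log2 N
```

For N = 8 this gives 2688 multiplications and 5952 additions for the vector-radix count, against 4608 and 7104 for the row-column count.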

Number theoretic transforms have also been used for solving the MD-DHT, since they perform extremely fast convolutions. In Boussakta (1988), it was shown how to decompose the MD-DHT transform into a form consisting of convolutions:

For the 2-D case (the 3-D case is also covered in the stated reference),

X(k,l)=\sum_{n=0}^{N-1} \sum_{m=0}^{M-1}x(n,m) \, \mathrm{cas}\left(\frac{2\pi nk}{N}+\frac{2\pi ml}{M}\right), \quad k=0,1,\ldots ,N-1 , \quad l=0,1,\ldots,M-1

can be decomposed into 1-D and 2-D circular convolutions as follows,

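X(k,l)=\begin{cases} X_1(k,0), & l=0 \\ X_2(0,l), & k=0,\ l\neq 0 \\ X_3(k,l), & k\neq 0,\ l\neq 0 \end{cases}

with

X_1(k,0)=\sum_{n=0}^{N-1} \left(\sum_{m=0}^{M-1}x(n,m)\right) \mathrm{cas}\left(\frac{2\pi nk}{N}\right), \quad k=0,1,\ldots,N-1

X_2(0,l)=\sum_{m=0}^{M-1} \left(\sum_{n=0}^{N-1}x(n,m)\right) \mathrm{cas}\left(\frac{2\pi ml}{M}\right), \quad l=1,2,\ldots, M-1

X_3(k,l)=\sum_{n=0}^{N-1} \sum_{m=0}^{M-1}x(n,m) \, \mathrm{cas}\left(\frac{2\pi nk}{N}+\frac{2\pi ml}{M}\right), \quad k=1,2,\ldots, N-1, \quad l=1,2,\ldots, M-1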


Developing X_3 further,

X_3(k,l)=\sum_{n=0}^{N-1}x(n,0) \, \mathrm{cas}\left(\frac{2\pi nk}{N}\right)+\sum_{m=1}^{M-1} x(0,m) \, \mathrm{cas}\left(\frac{2\pi ml}{M}\right)+\sum_{n=1}^{N-1} \sum_{m=1}^{M-1}x(n,m) \, \mathrm{cas}\left(\frac{2\pi nk}{N}+\frac{2\pi ml}{M}\right)

At this point we present the Fermat number transform (FNT). The t-th Fermat number is given by F_t=2^b+1, with b=2^t. The well-known Fermat numbers are those for t=0,1,2,3,4,5,6, of which F_t is prime for 0\le t \le 4 (Boussakta, 1988). The Fermat number transform is given by

X(k,l)=\sum_{n=0}^{N-1} \sum_{m=0}^{M-1}x(n,m)\alpha_{1}^{nk}\alpha_{2}^{ml} \bmod F_t

with k=0,\ldots, N-1 and l=0,\ldots,M-1. Here \alpha_1 and \alpha_2 are roots of unity of order N and M respectively (\alpha_{1}^{N}=\alpha_{2}^{M}=1 \bmod F_t).
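
A small 2-D FNT can be sketched in exact integer arithmetic. The example below assumes the Fermat prime F_3 = 257 and the roots α_1 = 2 (order 16) and α_2 = 4 (order 8), which are choices made for this illustration and not taken from the cited reference. The inverse uses Fermat's little theorem for modular inverses, which is valid only because F_3 is prime; the function names are illustrative.

```python
def fnt2(x, a1, a2, F):
    """Direct 2-D Fermat number transform of an N x M integer array modulo F."""
    N, M = len(x), len(x[0])
    return [[sum(x[n][m] * pow(a1, n * k, F) * pow(a2, m * l, F)
                 for n in range(N) for m in range(M)) % F
             for l in range(M)]
            for k in range(N)]

def ifnt2(X, a1, a2, F):
    """Inverse transform: same sum with inverse roots, scaled by 1/(N*M) mod F."""
    N, M = len(X), len(X[0])

    def inv(a):
        # Modular inverse via Fermat's little theorem (F assumed prime).
        return pow(a, F - 2, F)

    Y = fnt2(X, inv(a1), inv(a2), F)
    scale = inv(N * M % F)
    return [[v * scale % F for v in row] for row in Y]

F = 2 ** (2 ** 3) + 1  # F_3 = 257
a1, a2 = 2, 4          # 2^16 = 1 and 4^8 = 1 (mod 257)
# Arbitrary 16 x 8 test data with entries in [0, F).
x = [[(n * n + 3 * m) % F for m in range(8)] for n in range(16)]
X = fnt2(x, a1, a2, F)
recovered = ifnt2(X, a1, a2, F)
```

Because all arithmetic is modular and exact, the round trip recovers the input with no rounding error, and multiplications by powers of 2 reduce to bit shifts in a hardware or fixed-width implementation.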

Going back to the decomposition, denote the last term of X_3(k,l) by X_4(k,l):

X_4(k,l)=\sum_{n=1}^{N-1} \sum_{m=1}^{M-1}x(n,m) \, \mathrm{cas}\left(\frac{2\pi nk}{N}+\frac{2\pi ml}{M}\right)



If g_1 and g_2 are primitive roots modulo N and M (which are guaranteed to exist if N and M are prime), then the powers g_1^n \bmod N and g_2^m \bmod M run through all nonzero residues, so the index pair (n,m) can be replaced by (g_1^{n} \bmod N, g_2^m \bmod M). Mapping n, m, k, and l to g_1^{-n}, g_2^{-m}, g_1^k, and g_2^l, we get the following:

X_4(g_1^k,g_2^l)=\sum_{n=0}^{N-2} \sum_{m=0}^{M-2}x(g_1^{-n},g_2^{-m}) \, \mathrm{cas}\left(\frac{2\pi g_1^{(-n+k)}}{N}+\frac{2\pi g_2^{(-m+l)}}{M}\right)



This is now a circular convolution. With Y(k,l)=X_4(g_1^k,g_2^l), y(n,m)=x(g_1^{-n},g_2^{-m}), and h(n,m)=\mathrm{cas}\left(\frac{2\pi g_1^n}{N}+\frac{2\pi g_2^m}{M}\right), we have

Y(k,l)=\sum_{n=0}^{N-2} \sum_{m=0}^{M-2}y(n,m)h(k-n,l-m)

Y(k,l)=FNT^{-1} \{FNT[y(n,m)]\otimes FNT[h(n,m)]\}

where \otimes denotes term-by-term multiplication. Boussakta (1988) also stated that this algorithm reduces the number of multiplications by a factor of 8-20 over other DHT algorithms, at the cost of a slight increase in the number of shift and add operations, which are assumed to be simpler than multiplications. The drawback of this algorithm is the constraint that the size of each dimension of the transform must have a primitive root.
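
The index mapping that turns X_4 into a circular convolution can be checked on a small case. The sketch below assumes N = 5 and M = 7 with primitive roots g_1 = 2 and g_2 = 3, chosen here for illustration. It evaluates the circular convolution directly in floating point instead of through Fermat number transforms, so it demonstrates only the re-indexing step; the function names are illustrative.

```python
import math

def cas(z):
    return math.cos(z) + math.sin(z)

def x4_direct(x, N, M):
    """X_4(k, l) for k = 1..N-1, l = 1..M-1 by direct summation."""
    return {(k, l): sum(x[n][m] * cas(2 * math.pi * (n * k / N + m * l / M))
                        for n in range(1, N) for m in range(1, M))
            for k in range(1, N) for l in range(1, M)}

def x4_by_convolution(x, N, M, g1, g2):
    """X_4 via re-indexing with primitive roots and a 2-D circular convolution."""
    P, Q = N - 1, M - 1
    p1 = [pow(g1, a, N) for a in range(P)]  # g1^a mod N
    p2 = [pow(g2, b, M) for b in range(Q)]  # g2^b mod M
    # g^{-a} equals g^{(P - a) mod P} because g^P = 1 (mod N).
    y = [[x[p1[-a % P]][p2[-b % Q]] for b in range(Q)] for a in range(P)]
    h = [[cas(2 * math.pi * (p1[a] / N + p2[b] / M)) for b in range(Q)]
         for a in range(P)]
    out = {}
    for k in range(P):
        for l in range(Q):
            # Y(k, l) = sum_{n, m} y(n, m) h(k - n, l - m), indices modulo (P, Q).
            Y = sum(y[n][m] * h[(k - n) % P][(l - m) % Q]
                    for n in range(P) for m in range(Q))
            out[(p1[k], p2[l])] = Y  # Y(k, l) = X_4(g1^k, g2^l)
    return out

N, M, g1, g2 = 5, 7, 2, 3
# Arbitrary deterministic test data.
x = [[float((2 * n + 3 * m) % 7 - 3) for m in range(M)] for n in range(N)]
direct = x4_direct(x, N, M)
via_conv = x4_by_convolution(x, N, M, g1, g2)
max_error = max(abs(direct[key] - via_conv[key]) for key in direct)
```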


  • R. N. Bracewell, "Discrete Hartley transform," J. Opt. Soc. Am. 73 (12), 1832–1835 (1983).
  • R. N. Bracewell, "The fast Hartley transform," Proc. IEEE 72 (8), 1010–1018 (1984).
  • R. N. Bracewell, The Hartley Transform (Oxford Univ. Press, New York, 1986).
  • R. N. Bracewell, "Computing with the Hartley Transform," Computers in Physics 9 (4), 373–379 (1995).
  • R. V. L. Hartley, "A more symmetrical Fourier analysis applied to transmission problems," Proc. IRE 30, 144–150 (1942).
  • H. V. Sorensen, D. L. Jones, C. S. Burrus, and M. T. Heideman, "On computing the discrete Hartley transform," IEEE Trans. Acoust. Speech Sig. Processing ASSP-33 (4), 1231–1238 (1985).
  • H. V. Sorensen, D. L. Jones, M. T. Heideman, and C. S. Burrus, "Real-valued fast Fourier transform algorithms," IEEE Trans. Acoust. Speech Sig. Processing ASSP-35 (6), 849–863 (1987).
  • Pierre Duhamel and Martin Vetterli, "Improved Fourier and Hartley transform algorithms: application to cyclic convolution of real data," IEEE Trans. Acoust. Speech Sig. Processing ASSP-35, 818–824 (1987).
  • Mark A. O'Neill, "Faster than Fast Fourier," Byte 13 (4), 293–300 (1988).
  • J. Hong, M. Vetterli, and P. Duhamel, "Basefield transforms with the convolution property," Proc. IEEE 82 (3), 400–412 (1994).
  • D. A. Bini and E. Bozzo, "Fast discrete transform by means of eigenpolynomials," Computers & Mathematics (with Applications) 26 (9), 35–52 (1993).
  • Miodrag Popović and Dragutin Šević, "A new look at the comparison of the fast Hartley and Fourier transforms," IEEE Trans. Signal Processing 42 (8), 2178-2182 (1994).
  • Matteo Frigo and Steven G. Johnson, "The Design and Implementation of FFTW3," Proc. IEEE 93 (2), 216–231 (2005).
  • S. Chu and C. Burrus, "A prime factor FTT [sic] algorithm using distributed arithmetic," IEEE Transactions on Acoustics, Speech, and Signal Processing 30 (2), 217–227 (1982).
  • Thomas Bortfeld and Wolfgang Dinter, "Calculation of Multidimensional Hartley Transforms Using One-Dimensional Fourier Transforms," IEEE Trans. on Signal Processing 43 (5), 1306–1310 (1995).
  • S. Boussakta, A.G.J. Holt, "Fast Multidimensional Discrete Hartley Transform using Fermat Number Transform," IEEE Proc. 135 (6), 235-237 (1988).
  • S. Boussakta, O. Alshibami, "Fast Algorithm for the 3-D Discrete Hartley Transform," Proc. IEEE ICASSP '00, 4, 2302-2305 (2000).
  • Yonghang Zeng, Guoan Bi, Abdul Rahim Leyman, "Polynomial Transform Algorithms for Multidimensional Discrete Hartley Transform," IEEE Int. Symp. on Circuits and Systems, V, 517-520 (2000).