Cross-correlation

In signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time-lag applied to one of them. This is also known as a sliding dot product or sliding inner-product. It is commonly used for searching a long-duration signal for a shorter, known feature. It also has applications in pattern recognition, single particle analysis, electron tomographic averaging, cryptanalysis, and neurophysiology.

For continuous functions f and g, the cross-correlation is defined as:

  (f \star g)(\tau) = \int_{-\infty}^{\infty} f^*(t)\, g(t + \tau)\, dt,

where f^* denotes the complex conjugate of f.

Similarly, for discrete functions, the cross-correlation is defined as:

  (f \star g)[n] = \sum_{m=-\infty}^{\infty} f^*[m]\, g[m + n].
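As an illustration (an addition to this text, not part of the definition), the discrete sum can be evaluated directly and checked against NumPy's built-in routine, which uses the same signal-processing convention:

import numpy as np

def xcorr(f, g):
    """Direct evaluation of (f ⋆ g)[n] = sum_m conj(f[m]) g[m + n]."""
    nf, ng = len(f), len(g)
    lags = np.arange(-(nf - 1), ng)          # every lag with nonzero overlap
    out = np.zeros(len(lags), dtype=complex)
    for i, n in enumerate(lags):
        for m in range(nf):
            if 0 <= m + n < ng:
                out[i] += np.conj(f[m]) * g[m + n]
    return lags, out

f = np.array([1, 2, 3], dtype=complex)
g = np.array([0, 1, 0.5], dtype=complex)
lags, c = xcorr(f, g)
# np.correlate(a, v) computes sum_n a[n+k] conj(v[n]), so passing (g, f)
# reproduces (f ⋆ g) over the same lag range in 'full' mode.
assert np.allclose(c, np.correlate(g, f, "full"))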

The cross-correlation is similar in nature to the convolution of two functions.

In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero unless the signal is a trivial zero signal.
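A quick numerical check of this fact (added here; the test signal is arbitrary):

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
# Autocorrelation = cross-correlation of the signal with itself, at all lags.
ac = np.correlate(x, x, mode="full")
zero_lag = len(x) - 1                 # index of lag 0 in 'full' output
assert np.argmax(np.abs(ac)) == zero_lag
print(ac[zero_lag], np.sum(x * x))    # the zero-lag peak equals the signal energy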

In probability theory and statistics, correlation is always used to include a standardising factor in such a way that correlations have values between −1 and +1, and the term cross-correlation is used for referring to the correlation corr(X, Y) between two random variables X and Y, while the "correlation" of a random vector X is considered to be the correlation matrix (matrix of correlations) between the scalar elements of X.

If X and Y are two independent random variables with probability density functions f and g, respectively, then the probability density of the difference Y − X is formally given by the cross-correlation (in the signal-processing sense) f ⋆ g; however, this terminology is not used in probability and statistics. In contrast, the convolution f ∗ g (equivalent to the cross-correlation of f(t) and g(−t)) gives the probability density function of the sum X + Y.
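As a numerical illustration of this relationship (added here; the choice of exponential densities is arbitrary), discretizing the two densities on a grid lets the density of Y − X be approximated by their cross-correlation:

import numpy as np

dx = 0.01
t = np.arange(0, 10, dx)
f = np.exp(-t)                       # density of X ~ Exp(1), truncated at 10
g = 2 * np.exp(-2 * t)               # density of Y ~ Exp(2), truncated at 10
# Density of Y - X at lag tau: integral f(t) g(t + tau) dt, i.e. the
# cross-correlation of f and g, with dx supplying the Riemann-sum weight.
diff_pdf = np.correlate(g, f, mode="full") * dx
lags = dx * np.arange(-(len(f) - 1), len(g))
print(diff_pdf.sum() * dx)           # ≈ 1: a valid probability density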

Explanation

For example, consider two real-valued functions f and g differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much g must be shifted along the x-axis to make it identical to f. The formula essentially slides the g function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of (f ⋆ g) is maximized. This is because when peaks (positive areas) are aligned, they make a large contribution to the integral. Similarly, when troughs (negative areas) align, they also make a positive contribution to the integral because the product of two negative numbers is positive.
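The following sketch (an addition, using an arbitrary random test signal) recovers an unknown shift as the lag at which the cross-correlation peaks:

import numpy as np

n, true_shift = 200, 37
rng = np.random.default_rng(1)
f = rng.standard_normal(n)
g = np.roll(f, true_shift)            # g is f shifted (circularly) by 37 samples
# Slide f along g and pick the lag where the correlation is largest.
c = np.correlate(g, f, mode="full")
lag = np.argmax(c) - (n - 1)          # convert 'full'-mode index to a signed lag
print(lag)                            # 37: the shift that aligns f with g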

With complex-valued functions f and g, taking the conjugate of f ensures that aligned peaks (or aligned troughs) with imaginary components will contribute positively to the integral.

In econometrics, lagged cross-correlation is sometimes referred to as cross-autocorrelation.[1]

Properties

  • The cross-correlation of functions f(t) and g(t) is equivalent to the convolution of f^*(−t) and g(t). That is:

  f(t) \star g(t) = f^*(-t) * g(t)
  • If either f or g is Hermitian, then:

  f \star g = f * g

  • Analogous to the convolution theorem, the cross-correlation satisfies:

  \mathcal{F}\{f \star g\} = (\mathcal{F}\{f\})^* \cdot \mathcal{F}\{g\},

where \mathcal{F} denotes the Fourier transform, and an asterisk again indicates the complex conjugate. Coupled with fast Fourier transform algorithms, this property is often exploited for the efficient numerical computation of cross-correlations (see circular cross-correlation).

  • The cross-correlation of a convolution of f and h with a function g is the convolution of the cross-correlation of g and f with the kernel h:

  g \star (f * h) = (g \star f) * h
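To illustrate the Fourier-transform property above, here is a small added sketch using NumPy's FFT; note that the FFT computes a circular cross-correlation, so the signals are implicitly treated as periodic (zero-padding avoids this if linear correlation is wanted):

import numpy as np

rng = np.random.default_rng(2)
f = rng.standard_normal(128)
g = rng.standard_normal(128)
# F{f ⋆ g} = conj(F{f}) · F{g}: one inverse FFT yields all lags at once.
fast = np.fft.ifft(np.conj(np.fft.fft(f)) * np.fft.fft(g)).real
# Direct circular cross-correlation for comparison.
direct = np.array([np.dot(f, np.roll(g, -k)) for k in range(128)])
assert np.allclose(fast, direct)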

Normalized cross-correlation

For image-processing applications in which the brightness of the image and template can vary due to lighting and exposure conditions, the images can first be normalized. This is typically done at every step by subtracting the mean and dividing by the standard deviation. That is, the cross-correlation of a template t(x, y) with a subimage f(x, y) is

  \frac{1}{n} \sum_{x,y} \frac{(f(x,y) - \overline{f})\,(t(x,y) - \overline{t})}{\sigma_f \, \sigma_t},

where n is the number of pixels in t(x, y) and f(x, y), \overline{f} is the average of f and \sigma_f is the standard deviation of f. In functional analysis terms, this can be thought of as the dot product of two normalized vectors. That is, if

  F(x, y) = f(x, y) - \overline{f}

and

  T(x, y) = t(x, y) - \overline{t},

then the above sum is equal to

  \left\langle \frac{F}{\|F\|}, \frac{T}{\|T\|} \right\rangle,

where \langle \cdot, \cdot \rangle is the inner product and \|\cdot\| is the L² norm. Thus, if f and t are real matrices, their normalized cross-correlation equals the cosine of the angle between the unit vectors F and T, being thus 1 if and only if F equals T multiplied by a positive scalar.

Normalized correlation is one of the methods used for template matching, a process used for finding instances of a pattern or object within an image. It is in fact just the 2-dimensional version of the Pearson product-moment correlation coefficient.
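A brief added sketch of this idea (the image and template are arbitrary test data, and the brute-force scan is for clarity, not speed): the normalized score is invariant to brightness and contrast changes of the template, so the true offset still scores highest.

import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation of two equal-sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    return np.sum(p * t) / (np.linalg.norm(p) * np.linalg.norm(t))

rng = np.random.default_rng(3)
image = rng.standard_normal((64, 64))
th, tw = 8, 8
# Template cut from the image, then rescaled and brightened.
template = image[20:20 + th, 30:30 + tw] * 2.5 + 1.0
scores = np.array([[ncc(image[i:i + th, j:j + tw], template)
                    for j in range(64 - tw + 1)]
                   for i in range(64 - th + 1)])
print(np.unravel_index(np.argmax(scores), scores.shape))   # (20, 30)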

Time series analysis

In time series analysis, as applied in statistics, the cross-correlation between two time series describes the normalized cross-covariance function.

Let (X_t, Y_t) represent a pair of stochastic processes that are jointly wide-sense stationary. Then the cross-covariance is given by[2]

  \gamma_{xy}(\tau) = \operatorname{E}[(X_t - \mu_x)(Y_{t+\tau} - \mu_y)],

where \mu_x and \mu_y are the means of X_t and Y_t respectively.

The cross-correlation function is the normalized cross-covariance function:

  \rho_{xy}(\tau) = \frac{\gamma_{xy}(\tau)}{\sigma_x \, \sigma_y},

where \sigma_x and \sigma_y are the standard deviations of processes X_t and Y_t respectively.

Note that if X_t = Y_t for all t, then the cross-correlation function is simply the autocorrelation function.
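An added sketch of estimating the sample cross-correlation at a given lag (the toy series, in which y echoes x three steps later, is arbitrary):

import numpy as np

def sample_ccf(x, y, lag):
    """Estimate rho_xy(lag) = E[(X_t - mu_x)(Y_{t+lag} - mu_y)] / (sigma_x sigma_y)."""
    x = np.asarray(x) - np.mean(x)
    y = np.asarray(y) - np.mean(y)
    n = len(x)
    if lag >= 0:
        cov = np.mean(x[:n - lag] * y[lag:])
    else:
        cov = np.mean(x[-lag:] * y[:n + lag])
    return cov / (np.std(x) * np.std(y))

rng = np.random.default_rng(4)
x = rng.standard_normal(5000)
y = np.roll(x, 3) + 0.5 * rng.standard_normal(5000)   # y tracks x with a 3-step delay
print(sample_ccf(x, y, 3))    # close to 0.89: strong correlation at lag 3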

References

  1. ^ Campbell, J. Y.; Lo, A. W.; MacKinlay, A. C. (1996). The Econometrics of Financial Markets. Princeton, NJ: Princeton University Press.
  2. ^ von Storch, H.; Zwiers, F. W. (2001). Statistical Analysis in Climate Research. Cambridge University Press. ISBN 0521012309.