# Talk:Convolution

WikiProject Mathematics (Rated C-class, Mid-importance; Field: Analysis)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. One of the 500 most frequently viewed mathematics articles.

WikiProject Signal Processing
This article is within the scope of WikiProject Signal Processing, a collaborative effort to improve the coverage of signal processing on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

## Introductory definition

The introduction sounds like it might be the same as the functional inner product. It's extremely unclear. — Preceding unsigned comment added by Edgewalker81 (talkcontribs) 19:46, 4 January 2018 (UTC)

The initial definition doesn't mention that one of the functions is reversed:

...ing the area overlap between the two functions as a function of the amount that one of the original functions is translated.

Shouldn't it read "translated and reversed"?

Joelthelion (talk) 14:31, 15 September 2014 (UTC)

Or maybe the figure is incorrect? — Preceding unsigned comment added by 198.161.2.212 (talk) 18:07, 22 January 2016 (UTC)

I think you are right. The red line in the figure (function g) should be the same in convolution and cross-correlation, because if you convolve f and g, and then correlate f with g rotated 180 degrees, you should get the same result.   — Preceding unsigned comment added by 2001:4643:EBFE:0:3923:18BE:E93E:A96 (talk) 14:02, 19 November 2016 (UTC)

If we assume that the two top rows are labeled correctly (with f's and g's), then the 3 red traces should all look identical, as they did in the Oct 7 version. Then what's wrong is the $(f\star g)$ picture. It should look like $(f*g)$ because f(t) is symmetrical. Alternatively, it should be re-labeled $(g\star f).$ --Bob K (talk) 15:40, 19 November 2016 (UTC)
I expanded the figure to depict all the combinations of convolution, correlation, and order of operation (e.g. f*g and g*f). That should help.
--Bob K (talk) 19:04, 19 November 2016 (UTC)
Thank you for correcting it Bob K.  — Preceding unsigned comment added by 2001:4643:EBFE:0:C916:D6C0:D6BD:B173 (talk) 10:06, 20 November 2016 (UTC)
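For reference, the equivalence discussed above is easy to check numerically: convolving f with g gives the same sequence as cross-correlating f with a reversed (180°-rotated) copy of g. A minimal sketch in NumPy rather than the article's notation; the sequences are arbitrary examples, not taken from the figure:

```python
import numpy as np

# Arbitrary example sequences (not from the figure)
f = np.array([1.0, 2.0, 3.0, 0.0, -1.0])
g = np.array([0.5, 1.0, 0.25])

conv = np.convolve(f, g)                      # convolution of f and g
corr = np.correlate(f, g[::-1], mode='full')  # correlation of f with g reversed

# The two operations coincide when one function is time-reversed.
print(np.allclose(conv, corr))  # True
```

This is exactly the point made above: convolution and cross-correlation differ only by the reversal of one argument.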


## Derivative of convolution

The Convolution#Differentiation rule section was recently updated from

${\mathcal {D}}(f*g)={\mathcal {D}}f*g=f*{\mathcal {D}}g$

to

${\mathcal {D}}(f*g)={\mathcal {D}}f*g+f*{\mathcal {D}}g.$

I'm pretty sure it was correct the first time.

We know (using Laplace transform#Proof of the Laplace transform of a function's derivative) that:

${\mathcal {L}}\{{\mathcal {D}}f\}=s{\mathcal {L}}\{f\},\qquad {\mathcal {L}}\{{\mathcal {D}}g\}=s{\mathcal {L}}\{g\},$

and that:

{\begin{aligned}{\mathcal {L}}\{{\mathcal {D}}\{f*g\}\}&=s{\mathcal {L}}\{f*g\}\\&=s{\mathcal {L}}\{f\}{\mathcal {L}}\{g\}\\&={\mathcal {L}}\{{\mathcal {D}}f\}{\mathcal {L}}\{g\}\\&={\mathcal {L}}\{f\}{\mathcal {L}}\{{\mathcal {D}}g\}.\end{aligned}}

Therefore, I've changed it back for now. Oli Filth 15:36, 29 August 2007 (UTC)

Sorry, my mistake, thank you for correcting me :) Crisófilax 16:16, 30 August 2007 (UTC)

Mathworld lists it as the sum of the two terms: http://mathworld.wolfram.com/Convolution.html Can someone look it up in a textbook or verify numerically in Matlab? I'm changing it back to a sum. AhmedFasih (talk) 18:45, 25 February 2008 (UTC)

Update: I tried a simple test (convolving a triangle with a sinusoid, then differentiating) in Matlab, the Mathworld version, D(f*g)=Df*g+f*Dg, is numerically equivalent to the sum of the two expressions previously given here. I am inclined to believe the Mathworld version. AhmedFasih (talk) 18:56, 25 February 2008 (UTC)
A few points:
• I'm aware that Mathworld differs, but I'd stake my life on the fact that it's incorrect on this one.
• See the derivation above for why I think Mathworld is wrong.
• Numerical evaluations of discrete-time convolution can't prove anything about the continuous-time convolution. (The most they can do is indicate what may be the case.)
• However, it would seem you've messed up your experiment; try the code below:
t = -4*pi : 0.01 : 4*pi;
f = sin(t);                  % sinusoid
g = zeros(size(t));          % triangle centred near the middle of the vector
g(length(t)/2 - 1 - (0:200)) = linspace(1,0,201);
g(length(t)/2 + (0:200)) = linspace(1,0,201);
Df = diff(f);                % first differences approximate the derivative
Dg = diff(g);
Df_g = conv(Df, g);          % D(f)*g
f_Dg = conv(f, Dg);          % f*D(g)
fg = conv(f, g);
Dfg = diff(fg);              % D(f*g)
figure
cla, hold on
plot(Dfg, 'b'), plot(f_Dg, 'r'), plot(Df_g, 'k'), plot(Df_g + f_Dg, 'm')

Obviously, if D(f*g) = D(f)*g = f*D(g), then D(f)*g + f*D(g) = 2·D(f)*g, which is what the example above shows.
• Either way, you and I playing around with Matlab is original research; this can't be the basis of anything in the article.
Based on all of this, I'm going to remove the statement of the "convolution rule" until we can get this straightened out. Oli Filth(talk) 20:04, 25 February 2008 (UTC)
Actually, I'm not. See the ref that Michael Slone cited below, or p. 582 of "Digital Image Processing", Gonzalez + Woods, 2nd ed. I think we can safely assume that Mathworld is wrong on this one. Oli Filth(talk) 20:14, 25 February 2008 (UTC)
Yes, MathWorld just flubbed it. The derivative is just another impulse-response convolution, and these commute. There's no addition involved; someone editing that page probably got confused, thinking the * was a multiply. Dicklyon (talk) 03:56, 26 February 2008 (UTC)
FWIW, Mathworld is now corrected. Oli Filth(talk) 22:05, 2 April 2008 (UTC)

In the discrete case (if one sums over all of Z), one can directly compute that D(f * g) = (Df * g). Theorem 9.3 in Wheeden and Zygmund asserts (omitting some details) that if f is in Lp and K is a sufficiently smooth function with compact support, then D(f*K) = f*(DK). The proof appears on pp. 146 – 147. I am no analyst, but this appears to support the claim that convolution does not respect the Leibniz rule. Michael Slone (talk) 20:03, 25 February 2008 (UTC)

I feel you, I just looked it up in my Kamen/Heck "Fundamentals of signals and systems," 2nd ed., p. 125 and you are 100% right. Whew, a research problem just got a little bit easier, thanks much. AhmedFasih (talk) 13:11, 26 February 2008 (UTC)
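For what it's worth, the conclusion reached above is easy to confirm in the discrete case: the first difference is itself a convolution (with the kernel [1, −1]), so D(f*g) = Df*g = f*Dg reduces to associativity of convolution, while the product-rule version comes out exactly twice too large. A sketch in NumPy with random example sequences:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(50)
g = rng.standard_normal(40)

d = np.array([1.0, -1.0])  # first-difference kernel: (Df)[n] = f[n] - f[n-1]

D_fg = np.convolve(d, np.convolve(f, g))  # D(f*g)
Df_g = np.convolve(np.convolve(d, f), g)  # (Df)*g
f_Dg = np.convolve(f, np.convolve(d, g))  # f*(Dg)

print(np.allclose(D_fg, Df_g))             # True
print(np.allclose(D_fg, f_Dg))             # True
print(np.allclose(Df_g + f_Dg, 2 * D_fg))  # True: the "sum" rule double-counts
```

Because full convolution of finite sequences is exactly associative, the equalities here hold to floating-point precision rather than only approximately.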

## The visualization figure

...in the article is good. But it would be even better if the resulting convolved function were shown. It can be a little hard to imagine what the integral of the product of the two shown functions looks like as they slide over each other. I am not new to convolution; I just have not used it for seven years or so and came here to quickly recap what it is all about. For such a use case, a good figure is very powerful. -- Slaunger 14:15, 24 October 2007 (UTC)

I agree that the visualization figure is very powerful, but I am reverting to the previous version because I believe the Visual explanation of convolution figure is now cluttering the article; it is essentially a helper to the text. It was cleaner before. This is my opinion; if anyone disagrees we could discuss. --D1ma5ad (talk) 22:28, 2 March 2008 (UTC)
Ok. I can deal with the revert. (For reference purposes, this was the edit in question.) I think the image really needs to go, though, since it overwhelms the lead. Perhaps someone should write an "explanation" section, and include this image as an accompanying visual aid. siℓℓy rabbit (talk) 21:51, 17 July 2008 (UTC)
I have to agree with the first comment, I can think of no reason not to include the convolved function. It would only take a few more inches of real estate and no additional explanation. —Preceding unsigned comment added by 98.167.177.9 (talk) 03:29, 18 December 2008 (UTC)

"The integral of their product is the area of the yellow region." or "The integral of their product is the area of the triangle." —Preceding unsigned comment added by 58.107.79.95 (talk) 12:34, 4 February 2010 (UTC)

The first red triangle (function g) should be the other way round, as in the lower part of the diagram where f and g overlap. — Preceding unsigned comment added by 134.76.90.227 (talk) 09:58, 15 October 2015 (UTC)

It's the same function g in all three displays. The lower part shows the reflected function. 10:50, 15 October 2015 (UTC)

## Why the time inversion?

The article doesn't explain why g is reversed. What is the point of time inverting it? Egriffin 17:46, 28 October 2007 (UTC)

What do you mean by "the point"? Convolution is defined with time inversion, and as such, happens to have many useful applications. If you don't perform time-inversion, you have cross-correlation instead; which also has lots of useful applications. Oli Filth(talk) 17:56, 28 October 2007 (UTC)
Or why g instead of f? If you look at it, it makes no difference, since the variable of integration could just as well run the other way, and gets integrated out. In the result, you'll find that if either f or g is shifted to later, then their convolution shifts to later. For this to work, the integral needs to measure how they align against each other in opposite order. But think of the variable of integration as some "sideways" dimension, not time, and there's no "time reversal" to bother you. Or think in terms of the PDF of the sum of two independent random variables: their PDFs convolve, as you can work out, but there is no time involved and no reversal except relative inside the integral. Dicklyon (talk) 16:27, 26 February 2008 (UTC)

It helps to think of $\int _{-\infty }^{\infty }f(\tau )g(t-\tau )\,d\tau$ as a weighted average of the function $g(\tau )$ :
• up to the moment "t", if $f(\tau )$ happens to be zero for all negative values of $\tau$ , or
• centered around the moment "t", if $f(\tau )$ happens to be symmetrical around $\tau =0$ .
The weighting coefficient, $f(\tau ),$ for a positive value of $\tau ,$ is the weight applied to the value of function $g$ that occurred $\tau$ units (e.g. "seconds") prior to the moment "t". You may either infer that from the formula, or you may define  $f(\tau )$ that way and infer (i.e. derive) the formula from that definition.
Maybe your point is that something like this needs to be stated in the article, not here.
--Bob K (talk) 12:03, 1 May 2008 (UTC)
This question is worth considering in the article, IMO. I do not know about other fields, but convolution is a very important thing in the theory of abstract systems: you apply some input u(t) and the theory computes the output y(t). The relationship between y(t) and u(t), the system, is usually given implicitly, by a differential equation, which is easily resolved into the form y = Hu in the Laplace domain. The inverse Laplace transform brings the solution back to the time domain, where it is a convolution between the impulse response h(t) and u(t). Taking into account that any function f(t) can be represented as a sum of impulses, each zero everywhere except at its own time t where its amplitude is f(t), you can understand why convolution appears. Though the impulse response h(t) depends on the system, it usually looks like a single smoothed triangle close to 0+ on the t-axis, since the reference impulse, δ(x), is applied at time 0. If an input impulse is applied at a different time, τ, the argument to the h function is reduced by that amount: h(t-τ). The output y at moment t is the sum of all such impulse responses, each scaled by the input amplitude u(τ). This explains both the integration over τ ∈ [0,t] (the input is zero before 0, and the future, time > t, must not affect y(t)) and the multiplication of u(τ) by the "time-inverted" h(t-τ). This can be put the other way around: τ is the time difference between the impulse time t-τ and the current time t, so its contribution to y(t) is u(t-τ)·h(τ). Surely this helps and should be explained in the article. Additionally, in this picture, no signal is reversed in time. In this view, convolution has nothing to do with correlation; the reference to correlation is misleading, IMO. --Javalenok (talk) 20:23, 28 July 2010 (UTC)

I have realized that, in the discrete domain, it is even easier to explain. Take a difference relation, e.g. x[k+1] = Ax[k] + Bu[k], where x is the state and u is the input vector. Then x[2] = Ax[1] + Bu[1] = A(Ax[0] + Bu[0]) + Bu[1] = A^2 x[0] + ABu[0] + Bu[1], x[3] = A^3 x[0] + A^2 Bu[0] + ABu[1] + Bu[2], and in general

$x_{k}=A^{k}x_{0}+\sum _{i=0}^{k-1}{A^{k-1-i}}Bu_{i}$ .


The convolution seems to come up every time you solve a non-homogeneous differential equation. That is, dx/dt = ax(t) + bu(t) has a solution

$x(t)=x(0)e^{at}+\int _{0}^{t}{e^{a(t-\tau )}bu(\tau )}d\tau$ .

Herein, h[k] = A^{k}B and h(t) = e^{at}b are the impulse responses. --Javalenok (talk) 06:57, 1 August 2010 (UTC)
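The closed-form solution above can be checked against direct iteration of the recurrence. A sketch for a hypothetical scalar system (the values A = 0.9, B = 1 and the input sequence are arbitrary choices, not from the post):

```python
import numpy as np

A, B = 0.9, 1.0   # arbitrary scalar system x[k+1] = A*x[k] + B*u[k]
x0 = 2.0
u = np.array([1.0, 0.5, -0.3, 0.7, 0.2])

# Direct iteration of the recurrence
x = x0
for k in range(len(u)):
    x = A * x + B * u[k]

# Closed form: x_k = A^k x_0 + sum_{i=0}^{k-1} A^(k-1-i) B u_i
# (the sum is the convolution of the impulse response h[k] = A^k B with u)
n = len(u)
closed = A**n * x0 + sum(A**(n - 1 - i) * B * u[i] for i in range(n))

print(np.isclose(x, closed))  # True
```

The index pattern k−1−i in the sum is exactly the "time inversion" being discussed: the older the input sample, the higher the power of A that weights it.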

I think the clearest distillation of this post is: the contribution at time t of an impulse u occurring at time t−τ is u(t−τ)h(τ). However, I also agree with Dicklyon that the main point seems to be the symmetry of the convolution. All other considerations are too application-specific to be a compelling answer to "why" the time inversion happens. The most familiar case of convolution is the product of two polynomials (or series):
$(\sum a_{n}x^{n})(\sum b_{n}x^{n})=\sum _{n}\left(\sum _{k=0}^{n}a_{k}b_{n-k}\right)x^{n}.$
Here there's no deep reason "why" it's k in one factor and n−k in the other: it just works out that way so that the total degree of each term is n. The Fourier series of a function, as a series in the variable x = e^{iθ}, obeys the same relation (with a few trivial changes), and again there is no additional need to look for an explanation of "why". Sławomir Biały (talk) 14:40, 1 August 2010 (UTC)
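The coefficient formula above is exactly finite discrete convolution, which NumPy exposes directly; a quick sketch with arbitrary example polynomials:

```python
import numpy as np

a = [1, 2, 3]      # 1 + 2x + 3x^2   (coefficients a_n, lowest degree first)
b = [4, 0, 5, 6]   # 4 + 5x^2 + 6x^3 (coefficients b_n)

# c_n = sum_{k=0}^{n} a_k b_{n-k}: the coefficients of the product polynomial
c = np.convolve(a, b)

# Cross-check by evaluating both sides at an arbitrary point
# (np.polyval expects highest degree first, hence the reversals)
x = 1.7
lhs = np.polyval(a[::-1], x) * np.polyval(b[::-1], x)
rhs = np.polyval(c[::-1], x)
print(np.isclose(lhs, rhs))  # True
```

This is perhaps the cleanest illustration of the point: the k / n−k pairing is just degree bookkeeping, with no time and no reversal in sight.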

The question of 'why the time inversion?' arises because the mathematical formula for the continuous-time convolution has a part that involves $g(t-\tau )$ . The formula is not straightforward because there is an important piece of information that is often left out, which confuses a lot of people. The term $g(-\tau )$ is actually quite misleading. It should always be mentioned that an intermediate function such as p(-t) first needs to be formed, where p(-t) is purposely defined to be the mirror image of g(t) along the vertical axis. And once this mirror function p(-t) is defined, we then completely forget about the original function g(t), and we form a "brand new" function g(-t) = p(-t). Here, g(-t) should no longer be associated with the original g(t) function, since there is no direct connection - which is clearly understood by taking a hypothetical example case of the original g(t) function being causal, or equal to zero for t < 0, implying g(-t) = 0 when '-t' is negative. Obviously, the 'actual' g(-t) function is nothing like the newly "fabricated" g(-t) function. Hence we do not associate the newly constructed g(-t) function with the original g(t). The last step is to change the variable 't' to 'tau', leading to $g(-\tau )$ , which then leads to $g(t-\tau )$ for the sliding procedure. The main point is that g(-t) is not directly related to the original function g(t), even though the same letter 'g' is used for both functions. That's the catch. And, in my opinion, it's very poor form to teach people that g(-t) is the reflection of g(t) when we all know that is not how it is. Also, for discrete-time convolution, the 'time reversal' step merely provides a graphical way to implement the discrete summation formula. KorgBoy (talk) 07:16, 2 January 2019 (UTC)

## Note to associativity

(H * δ') * 1 = (H' * δ) * 1 = (δ * δ) * 1 = δ * 1 = 1

H * (δ' * 1) = H * (δ * 1') = H * (δ * 0) = H * 0 = 0

where H represents Heaviside's step function, whose derivative is Dirac's delta function

• Sorry for the form in which I am presenting this, but I am not very familiar with inputting math equations.
• Hello, this article seems to be about convolution of functions. If you want to consider distributions, then you must impose another assumption; for example, one of the distributions must have compact support for associativity to hold. In your example, certainly only Dirac has compact support. 78.128.195.197 (talk) 21:37, 10 March 2009 (UTC)
• Sorry for the errors; I was writing about commutativity. For associativity to hold, all but one of the distributions must have compact support. At least I think so. 78.128.195.197 (talk) 23:23, 10 March 2009 (UTC)

## intro section

This section is so incredibly abstract! Get a grip, you lofty mathematicians! I much prefer Wolfram's opening sentence: "A convolution is an integral that expresses the amount of overlap of one function g as it is shifted over another function f." -Reddaly (talk) 22:05, 4 August 2008 (UTC)

Good point. I prefer starting out very basic and working up to the "heights". Then readers can drop out when they find themselves outside their own comfort zone.
--Bob K (talk) 01:12, 6 August 2008 (UTC)
Could some application be mentioned in the intro? (A more detailed entry about the application can be put in the applications section.) This introduction is very mathematical, which may not be appropriate because the audience is not strictly mathematicians. I propose the following blurb be added to the intro: "In physical systems, the convolution operation models the interaction of signals with systems. That is, the output y(t) of an electronic filter may be predicted by convolving the input signal x(t) with a characterization h(t) of the system: y(t) = x(t) * h(t)." neffk (talk) 22:12, 22 June 2009 (UTC)
Something like this should be added to the lead. 173.75.157.179 (talk) 01:03, 9 October 2009 (UTC)
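A discrete version of the proposed blurb is easy to illustrate: an FIR filter's output is the convolution of the input signal with the filter's impulse response. A sketch in NumPy (the 3-point moving average and the input data are arbitrary examples):

```python
import numpy as np

h = np.full(3, 1/3)  # impulse response of a 3-point moving-average filter
x = np.array([0.0, 0.1, -0.1, 1.2, 0.9, 1.1, 1.0, 0.95])  # example input

y = np.convolve(x, h)  # y = x * h: the filter's predicted output

# Sanity check: feeding a unit impulse through the "system" returns h itself
delta = np.array([1.0])
print(np.allclose(np.convolve(delta, h), h))  # True
```

This is the same picture as the h(t), u(t) discussion in the section above, just in discrete time.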

## Comments

• I would like to see some discussion of the convolution in several variables, since this is important in areas of mathematics outside signal processing, such as partial differential equations and harmonic analysis. For instance, solutions to a linear pde such as the heat equation can be obtained on suitable domains by taking a convolution with a fundamental solution. I also note that some of the applications listed in the article clearly require convolutions of several variables. However, the current article focuses exclusively on the one-variable case. I think this is a rather serious limitation.
• Nothing is said here about the domain of definition, merely that ƒ and g are two "functions". I would like to see some discussion of the fact that the convolution is well-defined (by the formula given) if ƒ and g are two Lebesgue integrable functions (i.e., L1 functions). Moreover, if just ƒ is L1 and g is Lp, then ƒ*g is Lp.
• Furthermore, continuing the above comment, some of the basic analytic properties of convolution should also be covered. Among these are the estimate that if ƒ ∈ L1(Rd) and g ∈ Lp(Rd), then
$\|f\ast g\|_{p}\leq \|f\|_{1}\|g\|_{p}.$
From this estimate, a great many important results follow on the convergence in the mean of convolutions of functions. It can be used, for instance, to show that smooth functions with compact support are dense in the Lp spaces. The process of smoothing a function by taking a convolution with a mollifier also deserves to be included, as this has applications not only in proving the aforementioned result, but is also a ubiquitous principle used in applications (such as Gaussian blur).
• I also feel that a section should be created which at least mentions convolution of a function with a distribution (and related definitions), since these are significant for applications to PDE where one needs to be able to make sense of expressions such as ƒ*δ where δ is the delta function. The article itself seems to treat convolution with a distribution as well-defined implicitly. I would prefer to have this made explicit, as well as the precise conditions under which the convolution may be defined.
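The norm estimate quoted above (Young's inequality with one exponent equal to 1) also holds for sequences, so it can at least be sanity-checked numerically. A sketch in NumPy (random example sequences; p = 3 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(30)
g = rng.standard_normal(25)
p = 3.0

conv = np.convolve(f, g)

norm_conv = np.sum(np.abs(conv)**p) ** (1/p)             # ||f*g||_p
bound = np.sum(np.abs(f)) * np.sum(np.abs(g)**p)**(1/p)  # ||f||_1 ||g||_p

print(norm_conv <= bound)  # True
```

A numerical check proves nothing in general, of course (as noted earlier in this page), but it is a quick way to catch a mis-stated inequality.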

I'm not sure how to reorganize the article. I am leaning towards an expansion of the "Definition" section to include the case of several variables, and moving some of the discussion particular to signal processing somewhere else (I'm not sure where yet). The definition section should, I think, be followed by a "Domain of definition" section containing the details of what sort of functions (and distributions) are allowed. This should probably be followed by "Properties" (I think the circular and discrete convolutions should be grouped together with the other generalizations given towards the end of the article). I would then like to expand the "Properties" section. siℓℓy rabbit (talk) 14:03, 16 July 2008 (UTC)

I would prefer the article to build up to generalizations, like several variables. Many important points can be made first, with just the one-variable case. Fourier transforms can be defined in multiple dimensions, but we don't begin the article with that.
--Bob K (talk) 21:29, 17 July 2008 (UTC)
It is my intention to build up to generalizations (such as the convolution of a function with a distribution). However, I don't think the convolution on Rd is a very substantial generalization. The discrete version immediately following the "Definition" section is much more substantial, and probably less often used. siℓℓy rabbit (talk) 22:53, 17 July 2008 (UTC)
Not that it matters, but I would have guessed that the discrete version is the most commonly used form in this "digital age" of FIR filtering.
--Bob K (talk) 01:07, 18 July 2008 (UTC)