Talk:Convolution

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated C-class, Mid-importance)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
C Class
Mid Importance
 Field: Analysis
One of the 500 most frequently viewed mathematics articles.

Derivative of convolution[edit]

The Convolution#Differentiation rule section was recently updated from:

\mathcal{D}(f  * g) = \mathcal{D}f  * g = f  * \mathcal{D}g \,

to

\mathcal{D}(f  * g) = \mathcal{D}f  * g + f  * \mathcal{D}g \,

I'm pretty sure it was correct the first time.

We know (using Laplace transform#Proof of the Laplace transform of a function's derivative) that:

 \mathcal{L}\{\mathcal{D}f\} = s\mathcal{L}\{f\}
 \mathcal{L}\{\mathcal{D}g\} = s\mathcal{L}\{g\}

and that:


\begin{align}
  \mathcal{L}\{\mathcal{D}\{f * g\}\} &= s\mathcal{L}\{f * g\}  \\
                                      &= s\mathcal{L}\{f\}\mathcal{L}\{g\}  \\
                                      &= \mathcal{L}\{\mathcal{D}f\}\mathcal{L}\{g\} \\
                                      &= \mathcal{L}\{f\}\mathcal{L}\{\mathcal{D}g\}
\end{align}

Therefore, I've changed it back for now. Oli Filth 15:36, 29 August 2007 (UTC)

Sorry, my mistake, thank you for correcting me :) Crisófilax 16:16, 30 August 2007 (UTC)

Mathworld lists it as the sum of the two terms: http://mathworld.wolfram.com/Convolution.html Can someone look it up in a textbook or verify numerically in Matlab? I'm changing it back to a sum. AhmedFasih (talk) 18:45, 25 February 2008 (UTC)

Update: I tried a simple test (convolving a triangle with a sinusoid, then differentiating) in Matlab, the Mathworld version, D(f*g)=Df*g+f*Dg, is numerically equivalent to the sum of the two expressions previously given here. I am inclined to believe the Mathworld version. AhmedFasih (talk) 18:56, 25 February 2008 (UTC)
A few points:
  • I'm aware that Mathworld differs, but I'd stake my life on the fact that it's incorrect on this one.
  • See the derivation above for why I think Mathworld is wrong.
  • Numerical evaluations of discrete-time convolution can't prove anything about the continuous-time convolution. (The most they can do is indicate what may be the case.)
  • However, it would seem you've messed up your experiment; try the code below:
t = [-4*pi:0.01:4*pi];
f = sin(t);
g = zeros(size(t));
g(length(t)/2 - 1 - (0:200)) = linspace(1,0,201);
g(length(t)/2 + (0:200)) = linspace(1,0,201);
Df = f(2:end) - f(1:end-1);
Dg = g(2:end) - g(1:end-1);
Df_g = conv(Df, g);
f_Dg = conv(f, Dg);
fg = conv(f, g);
Dfg = fg(2:end) - fg(1:end-1);
figure
cla, hold on, plot(Dfg, 'b'), plot(f_Dg, 'r'), plot(Df_g, 'k'), plot(Df_g + f_Dg, 'm')
Obviously, if D(f*g) = D(f)*g = f*D(g), then clearly D(f)*g + f*D(g) = 2.D(f)*g, which is what the example above shows.
  • Either way, you and I playing around with Matlab is original research; this can't be the basis of anything in the article.
Based on all of this, I'm going to remove the statement of the "convolution rule" until we can get this straightened out. Oli Filth(talk) 20:04, 25 February 2008 (UTC)
Actually, I'm not. See the ref that Michael Slone cited below, or [1], or p.582 of "Digital Image Processing", Gonzalez + Woods, 2nd. ed. I think we can safely assume that Mathworld is wrong on this one. Oli Filth(talk) 20:14, 25 February 2008 (UTC)
Yes, MathWorld just flubbed it. The derivative is just another impulse response convolution, and these commute. There's no add involved; someone who was editing that page probably got confused, thinking the * was a mutiply. Dicklyon (talk) 03:56, 26 February 2008 (UTC)
FWIW, Mathworld is now corrected. Oli Filth(talk) 22:05, 2 April 2008 (UTC)

In the discrete case (if one sums over all of Z), one can directly compute that D(f * g) = (Df * g). Theorem 9.3 in Wheeden and Zygmund asserts (omitting some details) that if f is in Lp and K is a sufficiently smooth function with compact support, then D(f*K) = f*(DK). The proof appears on pp. 146 – 147. I am no analyst, but this appears to support the claim that convolution does not respect the Leibniz rule. Michael Slone (talk) 20:03, 25 February 2008 (UTC)

I feel you, I just looked it up in my Kamen/Heck "Fundamentals of signals and systems," 2nd ed., p. 125 and you are 100% right. Whew, a research problem just got a little bit easier, thanks much. AhmedFasih (talk) 13:11, 26 February 2008 (UTC)

The visualization figure[edit]

...in the article is good. But it would be even better if the resultant convoluted function was shown. It can be a little bit hard to image in ones brain what the integral of the product of the two shown functions look like as the two functions slide over each other. I am not new to convolution I just have not used it for seven years or so and went here to see and quickly recap what it is all about. And for such a use case of the article a good figure is very powerfull. -- Slaunger 14:15, 24 October 2007 (UTC)

I Agree that the visualization figure is very powerfull, but am reverting to previous version because I believe that now the Visual explanation of convolution figure is clutterring the article and is essentially a helper to the text. It was cleaner before. This is my opinion, if anyone else disagrees we could discuss. --D1ma5ad (talk) 22:28, 2 March 2008 (UTC)
Ok. I can deal with the revert. (For reference purposes, this was the edit in question.) I think the image really needs to go, though, since it overwhelms the lead. Perhaps someone should write an "explanation" section, and include this image as an accompanying visual aid. siℓℓy rabbit (talk) 21:51, 17 July 2008 (UTC)
I have to agree with the first comment, I can think of no reason not to include the convolved function. It would only take a few more inches of real estate and no additional explanation. —Preceding unsigned comment added by 98.167.177.9 (talk) 03:29, 18 December 2008 (UTC)

The integral of their product is the area of the yellow region. or The integral of their product is the area of the triangle. —Preceding unsigned comment added by 58.107.79.95 (talk) 12:34, 4 February 2010 (UTC)

Why the time inversion?[edit]

The article doesn't explain why g is reversed. What is the point of time inverting it? Egriffin 17:46, 28 October 2007 (UTC)

What do you mean by "the point"? Convolution is defined with time inversion, and as such, happens to have many useful applications. If you don't perform time-inversion, you have cross-correlation instead; which also has lots of useful applications. Oli Filth(talk) 17:56, 28 October 2007 (UTC)
Or why g instead of f? If you look at it, it makes no difference, since the variable of integration could just as well run the other way, and gets integrated out. In the result, you'll find that if either f or g is shifted to later, then their convolution shifts to later. For this to work this way, the integral needs to measure how they align against each other in opposite order. But think of the variable of integration as some "sideways" dimension, not time, and there's not no "time reversal" to bother you. Or think in terms of the PDF of the sum of two independent random variables: their PDFs convolve, as you can work out, but there is no time involved and no reversal except relative inside the integral. Dicklyon (talk) 16:27, 26 February 2008 (UTC)


It helps to think of \int_{-\infty}^{\infty} f(\tau) g(t - \tau)\, d\tau as a weighted average of the function g(\tau):
    • up to the moment "t", if f(\tau) happens to be zero for all negative values of \tau, or
    • centered around the moment "t", if f(\tau) happens to be symmetrical around \tau=0.
The weighting coefficient, f(\tau), for a positive value of \tau, is the weight applied to the value of function g that occurred \tau units (e.g. "seconds") prior to the moment "t". You may either infer that from the formula, or you may define  f(\tau) that way and infer (i.e. derive) the formula from that definition.
Maybe your point is that something like this needs to be stated in the article, not here.
--Bob K (talk) 12:03, 1 May 2008 (UTC)
This question worth considering in the article, IMO. I do not know about other fields, but convolution is a very important thing in the theory of abstract systems: you apply some input u(t) and the theory computes output y(t). The relationship between y(t) and u(t), the system, is usually given implicitly, by a differential equation, which is easily resolved into the from y = Hu in Laplace domain. The inverse Laplace brings solution back to time domain and it is a convolution between impulse response h(t) and u(t). Taking into account that any function f(t) can be represented as a sum of impulses that are 0 everywhere besides t, where impulse is f, you can understand: why convolution. Though impulse response, h(t), depends on system, it usually looks like a single smoothed triangle, close to 0+ on t-axis, since the reference impulse, δ(x), is applied at time 0. If input impulse is applied at different time, τ, the argument to the h function is reduced by that amount, h(t-τ). The output y at moment t is the sum of all such impulse responses (every is scaled by amplitude of input, u(τ)). This explains both integration over τ ∈ [0,t] (input is zero before 0 and future, time > t, must not affect y(t)) and multiplication of u(τ) by some `time inverse' h(t-τ). Look at the illustration of two-impulse u:Convolution of two pulses with impulse response.svg
This can be put the other way around: τ is the time difference between impulse time t-τ and current time t, thus its contribution into y(t) is u(t-τ)·h(τ). Surely, this helps and must be explained in the article. Additionally, in this picture, no signal is reversed in time. In this view, convolution has nothing to do with correlation. The reference to correlation is misleading, IMO. --Javalenok (talk) 20:23, 28 July 2010 (UTC)

I have realized that, in discrete domain, it is even easier to explain. Take a difference relation, e.g. x[k+1] = Ax[k] + Bu[k], where x is a state and u is the input vector. Then, x[2] = Ax[1] + Bu[1] = A(Ax[0] + Bu[0]) + Bu[1] = A2x[0] + ABu[0] + Bu[1], x[3] = A3x[0] + A2Bu[0] + ABu[1] + Bu[u],

x_k = A^kx_0 + \sum_{i=0}^{k-1}{A^{k-1-i}}Bu_i. 

The convolution seems to come up every time you solve a non-homogeneous differential equation. That is, dx/dt = ax(t) + bu(t) has a solution

x(t) = x(0)e^{at} + \int_0^t{e^{a(t-\tau)}bu(\tau)}d\tau. 

Herein, h[k] = AkB and h(t) = eatb are the impulse responses. --Javalenok (talk) 06:57, 1 August 2010 (UTC)

I think the clearest distillation of this post is: the contribution at time t of an impulse u occurring at time t−τ is u(t−τ)h(τ). However, I also agree with Dicklyon that the main point seems to be the symmetry of the convolution. All other considerations are too application-specific to be a compelling answer to "why" the time inversion happens. The most familiar case of convolution is the product of two polynomials (or series):
(\sum a_nx^n)(\sum b_nx^n) = \sum_n \left(\sum_{k=0}^n a_kb_{n-k}\right)x^n.
Here there's no deep reason "why" it's k in one and nk in the other: it just works out that way so that the total homogeneity on each term in n. The Fourier series of a function, as a series in the variable x = e, obeys the same relation (with a few trivial changes), and again there is no additional need to look for an explanation of "why". Sławomir Biały (talk) 14:40, 1 August 2010 (UTC)

Note to associativity[edit]

(H * δ') * 1 = (H' * δ) * 1 = (δ * δ) * 1 = δ * 1 = 1

H * (δ' * 1) = H * (δ * 1') = H * (δ * 0) = H * 0 = 0

where H represents heaviside's step function whose derivative is dirac's delta function

    • sorry for the form in which I am presenting this, but I am not well familiarized with input of math equations
  • hello, this article seems to be about convolution of functions. If you want to consider distributions, then you must impose another assumption, for example: One of the distributions must have a compact support for associativity to hold. In your example, certainly only Dirac has compact support. 78.128.195.197 (talk) 21:37, 10 March 2009 (UTC)
  • * sorry for errors, i was writing about kommutativity. For association to hold, all but one of distributions must have compact support. At least i think so. 78.128.195.197 (talk) 23:23, 10 March 2009 (UTC)

intro section[edit]

This section is so incredibly abstract! Get a grip, you lofty mathematicians! I much prefer Wolfram's opening sentence: "A convolution is an integral that expresses the amount of overlap of one function g as it is shifted over another function f." -Reddaly (talk) 22:05, 4 August 2008 (UTC)

Good point. I prefer starting out very basic and working up to the "heights". Then readers can drop out when they find themselves outside their own comfort zone.
--Bob K (talk) 01:12, 6 August 2008 (UTC)
Could some application be mentioned in the intro? (A more detailed entry about the application can be put in the applications section.) This introduction is very mathematical, which may not be appropriate because the audience is not strictly mathematicians. I propose the following blurb be added to the into: "In physical systems, the convolution operation models the interaction of signals with systems. That is, the output y(t) of a electronic filter may be predicted by convolving the input signal x(t) with a characterization of the system h(t); y(t) = x(t) * h(t). neffk (talk) 22:12, 22 June 2009 (UTC)
Something like this should be added to the lead. 173.75.157.179 (talk) 01:03, 9 October 2009 (UTC)

Comments[edit]

  • I would like to see some discussion of the convolution in several variables, since this is important in areas of mathematics outside signal processing, such as partial differential equations and harmonic analysis. For instance, solutions to a linear pde such as the heat equation can be obtained on suitable domains by taking a convolution with a fundamental solution. I also note that some of the applications listed in the article clearly require convolutions of several variables. However, the current article focuses exclusively on the one-variable case. I think this is a rather serious limitation.
  • Nothing is said here about the domain of definition, merely that ƒ and g are two "functions". I would like to see some discussion of the fact that the convolution is well-defined (by the formula given) if ƒ and g are two Lebesgue integrable functions (i.e., L1 functions). Moreover, if just ƒ is L1 and g is Lp, then ƒ*g is Lp.
  • Furthermore, continuing the above comment, some of the basic analytic properties of convolution should also be covered. Among these are the estimate that if ƒ ∈ L1(Rd) and g ∈ Lp(Rd), then
\|f\ast g\|_p\le \|f\|_1\|g\|_p.
From this estimate, a great many important results follow on the convergence in the mean of convolutions of functions. It can be used, for instance, to show that smooth functions with compact support are dense in the Lp spaces. The process of smoothing a function by taking a convolution with a mollifier also deserves to be included as this has applications, not only in proving the aforementioned result, but is also a ubiquitous principle used in applications (such as Gaussian blur).
  • I also feel that a section should be created which at least mentions convolution of a function with a distribution (and related definitions), since these are significant for applications to PDE where one needs to be able to make sense of expressions such as ƒ*δ where δ is the delta function. The article itself seems to treat convolution with a distribution as well-defined implicitly. I would prefer to have this made explicit, as well as the precise conditions under which the convolution may be defined.

I'm not sure how to reorganize the article. I am leaning towards an expansion of the "Definition" section to include the case of several variables, and moving some of the discussion particular to signal processing somewhere else (I'm not sure where yet). The definition section should, I think, be followed by a "Domain of definition" section containing the details of what sort of functions (and distributions) are allowed. This should probably be followed by "Properties" (I think the circular and discrete convolutions should be grouped together with the other generalizations given towards the end of the article). I would then like to expand the "Properties" section. siℓℓy rabbit (talk) 14:03, 16 July 2008 (UTC)

I would prefer the article to build up to generalizations, like several variables. Many important points can be made first, with just the one-variable case. Fourier transforms can be defined in multiple dimensions, but we don't begin the article with that.
--Bob K (talk) 21:29, 17 July 2008 (UTC)
It is my intention to build up to generalizations (such as the convolution of a function with a distribution). However, I don't think the convolution on Rd is a very substantial generalization. The discrete version immediately following the "Definition" section is much more substantial, and probably less often used. siℓℓy rabbit (talk) 22:53, 17 July 2008 (UTC)
Not that it matters, but I would have guessed that the discrete version is the most commonly used form in this "digital age" of FIR filtering.
--Bob K (talk) 01:07, 18 July 2008 (UTC)

References[edit]

Some additional references I plan to use in my improvements of the article are:

I plan to add more to this list as I progress. siℓℓy rabbit (talk) 15:07, 16 July 2008 (UTC)

Cyclic discrete convolution[edit]

I think this notation is too cryptic:

c_k = \sum_{i+j\equiv k\mod n} a_ib_j.

Please keep in mind that this is an encyclopedia, and we have plenty of space.

--Bob K (talk) 15:46, 18 July 2008 (UTC)

I think that notation is extremely clarifying; we should probably add it alongside the more traditional notation, since we have room. Dicklyon (talk) 16:00, 18 July 2008 (UTC)


The "ugly" one, i.e.:

c_k = \sum_{i=0}^k a_ib_{k-i} + \sum_{i=1}^{n-k-1} a_{k+i}b_{n-i}.

is understandable to me. It's what you get if you periodically extend the b sequence with period n, reverse it in time, delay it by k, and then sum the product of the sequences over the extent of a (or equivalently to infinity).

So a more elegant way of making the point without losing clarity is:

c[k]\, = (a * b_n)_k\,      for k=0,1,2,...,n-1
= \sum_{i=0}^{n-1} a[i] \cdot b_n[k-i]\,

where bn is the periodic extension of b:

b_n[i] \ \stackrel{\mathrm{def}}{=} \ \sum_{m=-\infty}^{\infty} b[i - mn] = \sum_{m=-\infty}^{\infty} b[i + mn] .

Those who already understand normal convolution will be able to leverage that insight to help them understand this. Those who don't understand normal convolution either shouldn't be here at all, or they should be inspired to go back and brush up on it.

--Bob K (talk) 20:44, 18 July 2008 (UTC)

Why not just leverage the circularity of the mod function at least:
c_k = \sum_{i=0}^n a[i]\cdot b[(k-i)\mod{n}]
Dicklyon (talk) 21:26, 18 July 2008 (UTC)

That works too. But since this whole article is about convolution, it's better to reinforce that relationship, IMHO. Since they are not mutually exclusive we can, and probably should, mention both explanations. Let the reader pick his own poison.

--Bob K (talk) 21:34, 18 July 2008 (UTC)

I think the section looks pretty good now. One thing I would like is a good reference for the periodization of the discrete functions. A text on Fourier series seems a natural place to look. I'll see what I can dig up. siℓℓy rabbit (talk) 03:44, 19 July 2008 (UTC)

Translation invariance[edit]

I have added a short and rough section on translation invariance. Obviously more should be said, since this is "the" most important property of convolutions. Hopefully we can iterate to get this new addition of mine into a more reasonable shape. siℓℓy rabbit (talk) 16:12, 24 July 2008 (UTC)

Intuition[edit]

I am planning on adding a section to this article with some intuitive conceptions of the convolution (I am using Professor David Jerison's "bank account" analogy) with regards to its usage in solving ordinary differential equations with rest initial conditions. This intuition is primarily useful for introductory undergraduate differential equations, and as far as I can tell, not so much for advanced mathematics. This is possibly more relevant to the article superposition principle or Green's function and I would not have any objections if it moved over there, but it would be nice to see some a more layman explanation for this. — Edward Z. Yang(Talk) 22:10, 23 April 2009 (UTC)

Ed, I reverted the long intuition example section. If this kind of example can be sourced, maybe we can use it; but I didn't have the impression that it was particularly helpful or intuitive, being basically a bunch of integral math and such, just like the rest of the article. Convolution is not so tricky that we need to resort to bank deposits to explain it, I think. Dicklyon (talk) 05:18, 24 April 2009 (UTC)
Source is here, albeit for a different example. This also works, although I am not sure if the link will always be available there. While the convolution may not be tricky (it is a relatively simple mathematical operation), many people will find the existence of a concrete example and/or usage of the term to help them understand what a convolution is, and why they might care. I have not restored the section. — Edward Z. Yang(Talk) 13:54, 24 April 2009 (UTC)
Those don't support you novel approach of making a continuous compound interest problem out of it, which is itself an unfamiliar concept to anyone who is likely to have trouble with the math. And they're not exactly WP:RS. Dicklyon (talk) 15:44, 24 April 2009 (UTC)
Sure. We can remove the analogy from it, and then reframe the section as a proof of the fundamental solution theorem. I would call them reputable sources under the self-publishing articles, but if that's not sufficient I can also find alternative sources. — Edward Z. Yang(Talk) 16:45, 24 April 2009 (UTC)
In an area with so much published, there's probably no reason to resort to self-published material. Dicklyon (talk) 19:37, 24 April 2009 (UTC)
It seems to me that because it's so elementary, you're not going to find peer-reviewed material appropriate for this article. Would a textbook be acceptable, in your eyes? — Edward Z. Yang(Talk) 22:12, 24 April 2009 (UTC)

Special characters template[edit]

I don't think this article needs the special characters template at all. The template makes sense if an article would unexpectedly have special characters (for example, if they appeared in a biography). But so many math articles have "special" characters that it's not helpful to put a template on every math article. — Carl (CBM · talk) 10:38, 16 May 2009 (UTC)

Good point, but that doesn't help the guy who's looking at his first Wikipedia article and seeing weird empty boxes. In fact I put up with those empty boxes long after my first article, because I simply didn't know what to do about them. I thought everyone was seeing the same thing I was seeing and that someday Wikipedia would fix "their problem".
--Bob K (talk) 13:12, 16 May 2009 (UTC)
If we wanted a note on every article with special characters, we would do that directly in the software. I don't think we should put a template on the vast majority of mathematics articles; in practice we don't seem to use it very much. — Carl (CBM · talk) 13:17, 16 May 2009 (UTC)

Explanation of revert[edit]

Translation invariant operators on Lp spaces are by definition operators that commute with all translations. This is completely standard terminology: see any good book on Fourier analysis (E.g. Stein and Weiss, Duoandikotxea, Grafakos, etc.) 71.182.236.76 (talk) 12:44, 31 October 2009 (UTC)

differentiation and convolution[edit]

Two positive integrable and infinitely differentiable functions may have a nowhere continuous convolution. That is there are infinitely differentiable (even entire) probability densities with a nowhere continuous convolution. If this happens when then the functions are badly unbounded.

I have constructed examples in the paper below and wrote a small correction to wikipedia some time ago, which has been "corrected". So I have added a line to the differentiation subtitle and a reference to my paper. I hope this will not appear too selfish.

MR1654158 (99m:60026)

Uludağ, A. Muhammed(F-GREN-FM) On possible deterioration of smoothness under the operation of convolution. (English summary) J. Math. Anal. Appl. 227 (1998), no. 2, 335--358.

As the author states in the introduction: ``It is known that, as a rule, the operation of convolution improves smoothness. In this paper, the above statement is looked at critically. An example is constructed of two probability densities which are restrictions to $\bold R$ of entire functions, though their convolution possesses an infinite essential supremum on each interval. —Preceding unsigned comment added by 134.206.80.237 (talk) 17:52, 4 March 2010 (UTC)

asterisk operator[edit]

I understand where the editor who removed the word "operator"[2] is coming from, since I admittedly never heard a mathematician call it by that name. However, I am somewhat chagrined about the loss of precision. This is not an idiosyncratic choice of words, but the official Unicode name of the character (U+2217). This is the only term that I'm aware of that unambiguously distinguishes it from all other types of asterisks. Providing that name is relevant in the context of the article. I'll therefore reinsert it with a mention of Unicode, which hopefully makes it clearer. — Sebastian 17:07, 22 June 2010 (UTC)

This is strangely over-specific. It is not written in stone that the convolution needs to be the "asterisk operator", many authors simply use an asterisk or conventional five-pointed star, not to mention the fact that the unicode glyph need not be used at all. I will adjust the text accordingly. Sławomir Biały (talk) 18:28, 22 June 2010 (UTC)
Your latest edit, in which you removed any mention of Unicode and even the link to the pertinent article about the symbol appears to be an emotional overreaction. I consider this an edit war, and I will not continue on that path. Instead, I ask you politely to undo at least the excess of your reversion.
You do have a point that there are more characters possible. That invites the comparison with multiplication. Not only does that article have a section on multiplication#Notation and terminology, but we even have an article on the multiplication sign itself. Obviously, the asterisk operator isn't as important as to warrant its own article, but at least is should be mentioned and referred to in the one article that describes its use, don't you think so? — Sebastian 19:18, 22 June 2010 (UTC)
How about if we just add one sentence like this: "Unicode defines the "[[asterisk operator]]" (U+2217) for this."? — Sebastian 19:30, 22 June 2010 (UTC)
You might do well to familiarize yourself with the consensus policy. You were WP:BOLD, and made an edit. I the made a change to your original edit, because I felt that the word "operator" was a poor choice in a mathematical context. You changed this back to mandate the use of unicode. I changed your subsequent edit because it imposes an artificial standard on the typography of the convolution operation. (E.g., the glyph \star is perhaps equally common in the literature.) What I have not seen demonstrated here is that any mention of Unicode is at all appropriate in the article. Does Unicode actually mandate that this character must be used for convolution? If so, can we please see a reference to that effect? Sławomir Biały (talk) 19:42, 22 June 2010 (UTC)

I have a question related to that, regarding the first line of the definition: What is ƒg supposed to be? Is that actually used in literature for the convolution (never saw that before) or is it just supposed to be a placeholder for asterix or star? If the latter is case, it might be better to simply explicitly write both notation rather than having such a placeholder construction, which in doubt might just be irritating to readers.--Kmhkmh (talk) 20:53, 22 June 2010 (UTC)

Sounds like your set-up doesn't display this Unicode character properly — i see ∗ as a slightly larger asterisk, but i'm guessing it appears as an empty box to you? The lack of universal support for the full Unicode set may be another reason for not being too dogmatic about its use. I notice the Table of mathematical symbols only gives the standard asterisk (*) in the HTML column of the entry for convolution. I don't claim to understand the relationship between Unicode and HTML. Qwfp (talk) 22:01, 22 June 2010 (UTC)
Yes I'm seeing an empty box, while the other asterixs are fine. Possibly just my setup, however since I'm using a fairly standard installation (default xp + default firefox) this problem might occur for a larger number of readers, hence it might be a not a good idea to use that unicode charachter in the article.--Kmhkmh (talk) 22:25, 22 June 2010 (UTC)
Hmm, I'm using XP & Firefox too, and i've just tried a few different fonts (Arial, Tahoma, Candara) and its ok in all. Qwfp (talk) 07:06, 23 June 2010 (UTC)
Ah sorry actually I used opera 10.53, when i looked at it. I checked with my firefox 3.5.9 and there it worked. But I don't quite get why we would need to use different asterixs throughout the text anyway, I mean why don't we use * only, since it is used throughout most of the article anyway and is displayed correctly on any system (due to being ascii).--Kmhkmh (talk) 09:17, 23 June 2010 (UTC)

Misleading Animations[edit]

The starting animations don't mention that you have to reflect one of the functions (e.g. filter) for that visualisation to work. (Of course the signal is symmetric, so it's not wrong...) Tinos (talk) 02:03, 24 July 2010 (UTC)

Put a message on the talk page of the guy who made it. You're right; g(t) should be g(tau - t), and (f*g)(t) should be (f*g)(tau). Dicklyon (talk) 03:51, 24 July 2010 (UTC)
I don't think he has a talk page. He's provided the code so I can reproduce it. I think he's right with (f*g)(t), though? (The asterisk still needs fixing, though.)Tinos (talk) 06:31, 24 July 2010 (UTC)
Actually, I think we're both wrong. You need a variable of integration running opposite directions in f and g, so maybe they should be called f(tau) and g(t - tau), with result (f*g)(t). But then there's also the fact that as plotted on the same x axis as the result, the f(tau) is really the same as f(t). It's hard to do this in a way that's not somewhat misleading or confusing. Dicklyon (talk) 06:50, 24 July 2010 (UTC)
I've made the animations. However I just discovered Brian actually has a talk page on Wiki Commons! I've notified him (I think).Tinos (talk) 07:13, 25 July 2010 (UTC)
I've updated the article. I tried synchronising the animations but Firefox opens the first one before the second one! Anyone know how to get Wikipedia to sync them up? I guess it's not important, but would be kind of cool. Feel free to suggest improvements to the animations.Tinos (talk) 10:23, 26 July 2010 (UTC)

Photos, audio?[edit]

This article would benefit from some photographic examples. Photos would be a much more accessible to the general public than abstract signals. For example, a photo, a Gaussian-blurred version, and a version convolved by a round tophat would show how the kernel affects the quality of the output. Also, the effect of a sharpening filter would show that convolution isn't just about blurring.

Similarly, an audio example playing a clean signal, an echo kernel (the echo of a Dirac delta), and then the same signal convolved by the echo kernel. —Ben FrantzDale (talk) 21:49, 26 September 2010 (UTC)

Just added a sharpening example (gold spider). Hope it's alright!
Tinos (talk) 13:17, 25 February 2011 (UTC)

Convolution vs modulation[edit]

Is modulation (such as AM or FM) a form of convolution? —Preceding unsigned comment added by 219.78.87.74 (talk) 11:02, 29 January 2011 (UTC)

Fig 1: An audio signal (top) may be carried by an AM or FM radio wave.
Good question; the answer is "no". Convolution can be thought of as replacing each sample in the time domain with a copy of a fixed kernel, scaled by that sample. In contrast, FM replaces each sample in the time domain with a signal that has an instantaneous frequency corresponding to that sample value. AM is more similar to convolution: In AM, you take a high-frequency sine wave and then scale each sample in that signal by the value of the signal you are encoding. This is a direct product. Imagine trying to cast AM as a convolution. Your kernel would presumably be a little wavelet of fixed frequency, but, e.g., a fixed nonzero signal as input would produce zero as output as each copy of the kernel canceled out the ones around it.
Put another way, each could be modeled like a convolution; with AM you'd change the phase of the kernel at every sample, and with FM you'd change the phase and frequency at each sample, but not the amplitude.
My intuition is that FM is nonlinear and that AM is a linear function of the carrier signal and the input signal, but nonlinear in the input signal alone, and so neither could possibly be described by a convolution, which is linear. —Ben FrantzDale (talk) 15:55, 31 January 2011 (UTC)

transforms[edit]

The first reference given (Bracewell) may be a great book, but it seems to be assuming from the very beginning that the reader is acquainted with a general idea of a transform. The WP lemma transform (mathematics) redirects to Integral transform. But what is a transform in general? Nobody seems to care to give a definition. --84.177.81.254 (talk) 17:35, 25 May 2011 (UTC)

Think of it as a function whose input values and output values are not numbers, instead they are functions. Some times the phrased is used for other things as well, such as the input and output values are vectors. (Such as linear transforms in linear algebra). Thenub314 (talk) 04:55, 26 May 2011 (UTC)
Thanks! So the elements of a lambda algebra must be transforms, too, I think. --84.177.86.232 (talk) 17:37, 26 May 2011 (UTC)

Does it really have to be a spider?[edit]

{{reqdiagram}}

Every time I refer to this article that thing catches me by surprise and I jump ten feet in the air. Why can't we convolve a nice butterfly? — Preceding unsigned comment added by 69.242.232.143 (talk) 03:18, 10 July 2011 (UTC)

I also find the image not quite to my taste. It needs to be too large in order for the effects of the convolution to be visible. A tiny part of the image (showing the hairs on the legs) could be much smaller and yet illustrate the sharpening effect more clearly. Sławomir Biały (talk) 14:33, 10 July 2011 (UTC)
Yes, it is too big. I think someone should crop off a leg; ideally the original artist. Tomgg (talk) 09:56, 9 August 2011 (UTC)
I agree. The same was done on Depth of Field (see Talk:Depth_of_field#Image_of_wolf_spider) for Wikipedia:Accessibility reasons (citing arachnophobia). —Ben FrantzDale (talk) 16:30, 9 August 2011 (UTC)
Another request for this - I can't finish the article because of that damn thing! — Preceding unsigned comment added by 141.163.156.59 (talk) 12:21, 24 November 2011 (UTC)
Done! Sławomir Biały (talk) 12:26, 24 November 2011 (UTC)
No one has requested that the image be removed. Rather, people would like it to be replaced. I think we should keep it in the article until a better one comes along. Yaris678 (talk) 12:51, 24 November 2011 (UTC)
I disagree. Many readers have already complained that the image is a significant impediment to reading the article. It should therefore be removed until someone produces a satisfactory replacement. It has been here for over four months already, despite objections. We shouldn't wait indefinitely for someone to make an alternative. Sławomir Biały (talk) 13:05, 24 November 2011 (UTC)

Start off easy, then push the reader over a cliff[edit]

Another lousy wikipedia article written by self-claimed experts to wank on about how they edited a wikipedia article.

The first paragraph is not bad.

The second paragraph, about the use of convolution in other groups is total wiki bullshit.

Put that crap down lower in the article. I don't care you have a Ph.D and that you are correct. Are you writing for more jackholes or for people that actually want to learn?

Similarly, get rid of stupid animations that have no ability to stop and let the person read.

DEMAND ATTENTION DEMAND ATTENTION DEMAND ATTENTION.

I made an animation, and regardless of how distracting it is, I will force you to read it.

In summation, typical crappy wikipedia article.

68.106.47.124 (talk) 17:35, 6 March 2012 (UTC)

The lead of the article is supposed to provide a capsule summary of the article. It isn't necessarily supposed to be the easiest part. Since much of the article does focus on generalizations of the convolution, the second paragraph is appropriate and roughly proportional to the prominence of this topic in the article. It's true that there should be more detail in the earlier sections on things that a larger group of readers might be interested in, like applications to image processing. These can then also be mentioned in the lead, which has ample room for expansion. At this point, I have no opinion about the animation. I seem to recall lobbying for its removal in the past whilst other editors have favored it's inclusion. I think it should probably stay, at least pending a wider discussion. Sławomir Biały (talk) 18:53, 6 March 2012 (UTC)
I don't understand the ambivalence about the .gif animation. I think it is clearly a positive feature of the article, and an excellent example of what computer-based learning can offer that hardcover books cannot. A pause button would also be a nice, but minor, addition. Perhaps someday Wikipedia will also figure out how to write articles on multiple levels... high school, undergrad, & grad. Good things take time.--Bob K (talk) 12:27, 10 March 2013 (UTC)

Examples of convolution[edit]

I saw the wiki page, but I couldn't find any examples using actual numbers evaluating the formula. Could you give some examples of convolution, please? Mathijs Krijzer (talk) 22:44, 9 March 2013 (UTC)

The two .gif files at the top of the article are doing convolutions with actual numbers. It doesn't get much better than that.
But if you want to "get your own hands dirty", I suggest you download Octave (it's free), and play around with the conv function. You might find some of the info at http://en.wikipedia.org/wiki/User_talk:Olli_Niemitalo#Window_function helpful for getting started. I'm using version 3.6.2. Version 3.4 has an installer, but I found that Octave and qtOctave don't need to be installed. I think the word for that property is "portable".
There are four instances of the conv() function in this script: http://commons.wikimedia.org/wiki/File:Circular_convolution_example.png#Octave_script
They use actual numbers generated elsewhere in the script.
Similarly, there are two instances in this script: http://commons.wikimedia.org/wiki/File:Overlap-save_algorithm.png#Octave_script
--Bob K (talk) 00:44, 10 March 2013 (UTC)
Yes, but the .gif files don't explain all the formulas step by step. What does this mean?:

In other words, compute a sliding, weighted-average of function f(\tau), where the weighting function is g(-\tau).
The resulting waveform (not shown here) is the convolution of functions f and g. If f(t) is a unit impulse, the result of this process is simply g(t), which is therefore called the impulse response.

--Mathijs Krijzer (talk) 13:42, 10 March 2013 (UTC)
Sorry... it took me a while to realize that you've raised a very important point. The article needs to say more about topics such as Dirac_delta_function#Translation and LTI_system_theory#Impulse_response_and_convolution. In the meantime, maybe those links will help you. --Bob K (talk) 19:24, 11 March 2013 (UTC)
It already does, in plain sight actually, in more than one way. Same goes for the examples listed below. Usual problem with articles like this. Engineers can't read math and math people don't care to translate. Mct mht (talk) 21:32, 18 March 2013 (UTC)

Definition[edit]

The convolution of f and g is written fg, using an asterisk or star. It is defined as the integral of the product of the two functions after one is reversed and shifted. As such, it is a particular kind of integral transform:

(f * g )(t)\ \ \,   \stackrel{\mathrm{def}}{=}\ \int_{-\infty}^\infty f(\tau)\, g(t - \tau)\, d\tau
= \int_{-\infty}^\infty f(t-\tau)\, g(\tau)\, d\tau.       (commutativity)

Domain of definition[edit]

The convolution of two complex-valued functions on Rd

(f*g)(x) = \int_{\mathbf{R}^d}f(y)g(x-y)\,dy

is well-defined only if f and g decay sufficiently rapidly at infinity in order for the integral to exist. Conditions for the existence of the convolution may be tricky, since a blow-up in g at infinity can be easily offset by sufficiently rapid decay in f. The question of existence thus may involve different conditions on f and g.

Circular discrete convolution[edit]

When a function gN is periodic, with period N, then for functions, f, such that fgN exists, the convolution is also periodic and identical to:

(f * g_N)[n] \equiv \sum_{m=0}^{N-1} \left(\sum_{k=-\infty}^\infty {f}[m+kN] \right) g_N[n-m].\,

Circular convolution[edit]

Main article: Circular convolution

When a function gT is periodic, with period T, then for functions, f, such that fgT exists, the convolution is also periodic and identical to:

(f * g_T)(t) \equiv \int_{t_0}^{t_0+T} \left[\sum_{k=-\infty}^\infty f(\tau + kT)\right] g_T(t - \tau)\, d\tau,

where to is an arbitrary choice. The summation is called a periodic summation of the function f.

Discrete convolution[edit]

For complex-valued functions f, g defined on the set Z of integers, the discrete convolution of f and g is given by:

(f * g)[n]\ \stackrel{\mathrm{def}}{=}\ \sum_{m=-\infty}^\infty f[m]\, g[n - m]
= \sum_{m=-\infty}^\infty f[n-m]\, g[m].       (commutativity)

When multiplying two polynomials, the coefficients of the product are given by the convolution of the original coefficient sequences, extended with zeros where necessary to avoid undefined terms; this is known as the Cauchy product of the coefficients of the two polynomials.

Lead images[edit]

The lead images to this article are not good. They are so wide that they completely dominate the lead, squashing all of the text aside. Moreover, the new captions I think are worse. They are much less succinct that the original captions, and dwell on things like the difference between cross-correlation and convolution, which the images don't even illustrate. If someone wants to write an examples section that includes these (or other) images, along with detailed explanations, that might be better. Sławomir Biały (talk) 01:50, 12 March 2013 (UTC)

I think the .gif "movies" are not helpful at their current location. They would be more helpful with the formulas (Definition section) that they serve to illustrate. The reason I brought up cross-correlation is so that I could equate g(t-\tau) to g(\tau-t), which is easier to grasp. It was awkward, I agree, but maybe less awkward than g(t-\tau). That was the hope. I also thought it important to say how  t is defined. But all those details became cumbersome. I think it is better now. --Bob K (talk) 04:25, 12 March 2013 (UTC)

Deleted References to GNU C-Graph[edit]

The following is re-posted, here, on advice from PamD noted at the end of this section.

Long before I wrote the proposed article "GNU C-Graph", I included references under the articles Convolution theorem and Convolution for the benefit of students and their professors teaching and learning about convolution. These references were removed by MrOllie and Hu12. Notwithstanding my belief that GNU C-Graph meets the WP:Notability criteria, Mark viking pointed out "external links aren't required to be WP notable, just relevant".

Most reasonable people would agree that removing the reference to the only software that easily demonstrates the convolution theorem would be a disservice to the public. The text of the references reads:

Under Convolution#External_links:

  • GNU C-Graph, a free software package for visualizing the convolution theorem, and displaying signals with their spectral plots.

Under Convolution_theorem#Additional_resources:

  • GNU C-Graph, a free software package for demonstrating the convolution theorem, and displaying signals with their spectral plots.

No doubt some WP:AGF administrator will recognise the educational value and reinstate the references for the benefit of the public.Visionat (talk) 14:16, 24 April 2013 (UTC)

Visionat, the places for these discussions are Talk:Convolution theorem and Talk:Convolution, not your own talk page. Those pages are also where you should have suggested the addition of links to the software you have written, bearing in mind WP:COI: to avoid any interpretation that you are promoting your own software, you should suggest the addition of the links on the talk page and allow independent editors (not necessarily administrators) to decide whether the links are an asset to the articles. If you look at WP:COIU, you will see that an editor with a possible conflict of interest may "make edits where there is clear consensus on the talk page (though it is better to let someone else do it)". That's the way to go. It's nothing personal, so your talk page is not the place for the discussion. I hope that helps. PamD

Discrete convolution[edit]

The formulae are wrong in that the limits of m may be finite, not infinite. A good example of this is in Numerical smoothing and differentiation. Petergans (talk) 08:47, 5 May 2013 (UTC)

To get the discrete convolution of finite sequences, you must extend those sequences by zero outside their domain. In fact, it even seems to be necessary to do this in the formulae in the article that you linked to. Sławomir Biały (talk) 11:12, 5 May 2013 (UTC)
I have clarified the text of that article. The finite convolution formula there is now written explicitly as
(C \otimes y)_j\ = \sum_{i=-(m-1)/2}^{i=(m-1)/2} C_i\, y_{j-i},
m is a finite odd integer. Petergans (talk) 15:37, 5 May 2013 (UTC)
If the sequence y has length m, then you need to extend by zero whenever j-i>(m-1)/2. Sławomir Biały (talk) 18:45, 5 May 2013 (UTC)
No, y has length n, n>>m. The formula is the exact counterpart of the (continuous) definition, but with discrete values for i and finite limits for the summation. The number of calculated convolution points is equal to n-m+1, so this formula should only be used when n>>m. You can see an application in spectroscopic line shape#derivative spectroscopy: I used m=5 and n=100 to construct the diagram.Note that int(m/2) points at the start and end of the result are undefined; one could use zero extensions for those points, but that would be arbitrary and unnecessary. In the diagram only the central data from a wider dataset are shown, so that the undefined points are outside the plot. Petergans (talk) 08:54, 6 May 2013 (UTC)
Ok, but you're still going to run into trouble when j-i is out of bounds. It seems quite strange to argue that the present article is wrong by comparison with such a manifestly inadequate definition. Sławomir Biały (talk) 18:45, 7 May 2013 (UTC)
I gotta say, Sławomir Biały, you're very patient. Mct mht (talk) 19:06, 7 May 2013 (UTC)

The point is that j-i does not go out of bounds, because the formula is only to be applied when this quantity is in bounds. Numerical smoothing and differentiation has been used for many years, since Savizki and Golay published tables of convolution coefficients in 1964 It is accepted that the outermost points are undefined. Look at the way smoothing using a 5-point quadratic function is used at the start of the data, y.


\begin{pmatrix} {Y_3 }  \\ {Y_4 }    \\\end{pmatrix}=
\begin{pmatrix}
   { - 3/35} & {12/35} & {17/35} & {12/35} & { - 3/35} & 0  \\ 
   0 &{ - 3/35} & {12/35} & {17/35} & {12/35} & { - 3/35}  \\ 
    \end{pmatrix}
 \begin{pmatrix}{y_1 }  \\ {y_2 }  \\ {y_3 }  \\ {y_4 }  \\ {y_5 } \\ {y_6 } \\\end{pmatrix}.

The values of Y1 and Y2 are undefined. Similarly the last 2 points in the series will be undefined. The points which are undefined when using this formula can be obtained by using other formulae. User:Sławomir Biały, your objections are based on hypothetical circumstances which do not occur in practice because everyone should know, as you do, that they are invalid.. Petergans (talk) 11:32, 9 May 2013 (UTC)

"The point is that j-i does not go out of bounds, because the formula is only to be applied when this quantity is in bounds." — Then why does it matter so much how the convolution is defined, if the only cases you object to are never going to be applied? The mathematical definition of convolution certainly requires the group structure of the underlying space (in this case the group of integers). There is no applicable notion of "out of bounds" in the mathematical sense. Of course, which bounds are reasonable will depend on the application, but the same is true any time a mathematical device is used in numerical work. Also, I should add that the mathematical notion of convolution (even fairly sophisticated versions in locally compact groups) predate the source that you just cited by at least 15 years. Sławomir Biały (talk) 12:21, 9 May 2013 (UTC)
If I read the first displayed formula correctly, it's wrong, unless C has length m. The length of y is irrelevant. The formula with infinite limits is correct; if either f or g has finite support, the limits can be made finite. — Arthur Rubin (talk) 18:30, 9 May 2013 (UTC)
What I object to is the sentence "The convolution of two finite sequences is defined by extending the sequences to finitely supported functions on the set of integers." Extending the sequence is merely one way of dealing with the end-points which I describe above as undefined. I can give others if need be. The issue with infinite limits is a lack generality, particularly in view of the widespread use of the formula above when using Savitzki-Golay coefficients (yes, I know that the idea goes back into the 19th. century, it's a simple application of least-squares theory). It seems to me that there are two classes of application: f and g may be either functions or vectors. Let's make this explicit in the article and close this discussion. Agreed? Petergans (talk) 19:20, 9 May 2013 (UTC)
I can see your point about "is defined"; perhaps replace by "can be computed by", or replace the infinite sums by summing only over where the coeficients are defined. It makes no difference whether f is a function with finite support or a vector, and the fact that it makes no difference is important. Arthur Rubin (talk) 19:49, 9 May 2013 (UTC)
Having looked at the Savitzki-Golay article that you seem to think is the final authority on convolution, it's clear that the authors have confused the convolution with the cross-correlation. While this is not directly germane to the discussion, it does rather cast doubt on whether that can be considered a reliable source on Fourier analysis. Sławomir Biały (talk) 19:57, 9 May 2013 (UTC)

This is insulting. See my book "Data fitting in the chemical sciences", section 9.3, p. 202, "Convolution and cross-correlation". Petergans (talk) 10:10, 10 May 2013 (UTC)

I think folks are arguing a bit at cross purposes. There are different kinds of convolution being discussed here. For continuous domains, there is convolution defined on the real line and circular convolution defined on a finite segment of the real line; both allow for the convolution theorem to hold true. Taking discrete subsets, we have discrete convolution over the integers and circular discrete convolution over a finite range of integers. Suitably defined, both discrete types satisfy a version of the discrete convolution theorem. The convolution theorem is important both for pure math and for practical calculation. Fast convolution using the discrete Fourier transform is n\log(n) and straight multiplication and addition is n^2.
So given two finite sequences of length n and m with n\ge m, how do we convolve? First of all, mapping to discrete circular convolution, already being finite, is the way to go. If you want to use fast convolution, the discrete convolution theorem requires both sequences be of length n, so zero-padding is necessary here to extend the shorter sequence. If you want to convolve using simple multiplication and addition, you have two options. Either you can use zero padding, which makes the iteration logic simple. Or you can avoid zero padding and add extra conditionals to the iteration code to prevent access of out-of-bounds sequence elements. Convolution with finite sequences is nicely discussed in section 13.1 of Numerical Recipes in C, 2nd. ed.
It would be useful to elucidate these issues in the article. My recommendation would be to discuss convolution with finite sequences in the Circular discrete convolution section --Mark viking (talk) 22:48, 9 May 2013 (UTC)
No, that's not what Petergans is saying. His argument is ad hoc, mathematically unsophisticated and wrong. The mathematicians are saying that convolution is defined using the group structure of the underlying group. That dictates how you you convolve. When the group is the integers, then necessarily the elements of the group algebra are absolutely summable sequences. When the group is, say, cyclic abelian of finite order n, then the group algebra consists of finite sequences of length n and you get the "discrete circular convolution." Real line, same thing, you have L^1 functions. The convolution is defined the same way for any locally compact abelian group and the Fourier analysis goes through just the same. To say that convolution is undefined for sequences is non-sensical. Mct mht (talk) 23:16, 9 May 2013 (UTC)

Another insult! My argument is not "wrong". It concerns a practical application, which needs to be taken into account in the mathematics. I'm not saying that "convolution is undefined"; some points at the start and end of the convolved sequence are undefined.

(C \otimes y)_j\ = \sum_{i=-(m-1)/2}^{i=(m-1)/2} C_i\, y_{j-i} \qquad j= (m-1)/2+1,...,n-(m-1)/2

n is the length of the y-vector and, implicitly, m is an odd integer. Thus, for j=1,..,(m-1)/2 the convolved points are undefined, and similarly at the end of the sequence. In total m-1 points are undefined. As I mentioned before, values for the points that are undefined with this formula can be obtained by other means. Incidentally, on p209 of my book Fig. 9.8 shows part of the discrete Fourier transform of a vector of 15 quadratic/cubic smoothing coefficients, C. Petergans (talk) 10:10, 10 May 2013 (UTC)

I don't have access to your book, but if you're doing DFT, then the companion notion of convolution is actually the cyclic convolution (which should never be "undefined"). Sławomir Biały (talk) 10:47, 10 May 2013 (UTC)
With a 5-point function the above formula gives Y_1=C_{-2}y_{-1}+C_{-1}y_0+C_0 y_1+C_1y_2+C_2y_3 Y1 is undefined because y-1 and y0 do not exist (see example above). Circular convolution has nothing to do with this.Petergans (talk) 15:46, 10 May 2013 (UTC)
You're the one that brought up the discrete Fourier transform. If you really believe that the circular convolution has nothing to do with this, then you don't have a clue what you're talking about. Sławomir Biały (talk) 15:59, 10 May 2013 (UTC)
I don't have much time to comment, but I see what Petergans is up to, I think. The difference is between the values other than specified being undefined, and the values other than specified being assumed zero. The former (Petergans) has absoutely nothing to do with FFT or DFTs, though. — Arthur Rubin (talk) 16:31, 10 May 2013 (UTC)
I agree. Also agreed that this has nothing to do with circular discrete convolution or DFTs. As Mct mht nicely summarized above, mathematicians have a single concept of convolution based on abstract harmonic analysis on various topological groups or semigroups. But for other fields of endeavor, (data analysis, signal processing, etc.) convolution often has alternative meanings. Some of those alternative meanings may be considered ad hoc compared to the pure math concept, but they are in current use, discussed in reliable sources, and WP should cover them. As another example, consider image convolution in Gimp, an image processing program like Photoshop. In Gimp, there are different options for handling image borders in the convolution operation: extend, wrap, or crop. If I understand Petergans' example above, he advocates a 'crop' in his instance. 'Wrap' is more akin to mapping to discrete circular convolution. And I know of other conventions in image processing for handling convolution at the borders. Ideally, we cover it all. --Mark viking (talk) 17:55, 10 May 2013 (UTC)
There is an applications section for ad hoc applications such as these. I emphasize that just because some people do not define the convolution outside of some bounds does not make the definition in the article "wrong" as Peter Gans is insisting. Sławomir Biały (talk) 18:28, 10 May 2013 (UTC)

That's enough! After a third insult I quit this discussion. (Help:Wikipedia: The Missing Manual/Collaborating with Other Editors/Handling Incivility and Personal Attacks) Petergans (talk) 18:47, 10 May 2013 (UTC)


Time for a change?[edit]

It's a little misleading for the article to list these formulas before giving the definition in the general case. It gives and reinforces, as above exchange shows, various confusions: the domain is not clear, there are "different" convolutions, etc. Maybe that should be changed?

Convolution theorem actually embodies this problem: two "different" proofs for two special cases. Mct mht (talk) 04:38, 10 May 2013 (UTC)

I agree that taking the abstract harmonic analysis approach to convolution allows for a pretty unification of the different domains, as you nicely noted above. But from my experience, most electrical engineers, computer scientists, photographers, etc., don't know what a group is. And in the real world, signals are finite, symmetry is often broken, and periodicity is a convenient fiction to make the math easier. I think we would need to be careful not to scare away users with heavy math at the beginning of the article. --Mark viking (talk) 05:34, 10 May 2013 (UTC)
From Numerical Recipies, Pascal edition, p. 450, quote: Evidently, we have just described in words the following definition of discrete convolution with a response function of finite duration, M
(r*s)_j\ = \sum_{k=-(M-1)/2+1}^{M/2} s_{j-k} r_k \qquad (12.4.1) unquote
This definition conflicts with the definition given in the article
(f * g)[n]\ \stackrel{\mathrm{def}}{=}\ \sum_{m=-\infty}^\infty f[m]\, g[n - m]
because the summation limits are finite in one expression and infinite in the other. Petergans (talk) 09:28, 11 May 2013 (UTC)
I thought that Peter Gans had said he would leave this discussion! To stop these useless discussion, I have the following suggestion: change the title of this discrete section to "convolution on Z", put more math in it, like a discussion of the Wiener algebra A(T) and perhaps some convolution on finite cyclic groups. Create somewhere else a section about practical use of convolution. I like the idea of User:Mark viking to describe the problems at the boundary with Gimp, and the ad hoc solutions available. One last comment: this article starts with "in MATHEMATICS"... Bdmy (talk) 09:41, 11 May 2013 (UTC)

Unacceptable reversion[edit]

The reversion by user:Bdmy is unacceptable. The title of this article is Convolution, and the formula fits the general definition (Numerical Recipes, p. 450).

This formula must be included because it covers a range of applications. For example, in numerical smoothing and differentiation the term convolution is linked to this article, so it is essential that a reader, using the link from that article, will find a consistent definition here. Petergans (talk) 13:19, 14 May 2013 (UTC)

Your proposed edit is inconsistent with the source you cited, where f is explicitly an infinite signal and g is a finite impulse response. Moreover, the formula you gave
(f*g)_n=\sum_{m=1}^{M} f_{n- \left(\frac {M+1}{2}-m\right)}g_{m} \qquad \frac {M+1}{2} \le n \le  N-\frac {M-1}{2}
is just wrong. In particular, it doesn't appear in any of the sources cited in the article, nor indeed any sources that I am familiar with. Sławomir Biały (talk) 13:28, 14 May 2013 (UTC)

By good fortune I still have a reprint of the SG paper (Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures". Analytical Chemistry 36 (8): 1627–1639. doi:10.1021/ac60214a047. ). The formula they give is

 Y_j*=\frac{\sum_{i=-m}^{i=m}C_i Y_{j+i}}{N}

Here, m=(M-1)/2. I wanted to avoid the plus sign (j+i) which makes it look as though it is cross-correlation (SG explicitly state that it is convolution) also to use normal vector indexing from 1 to M. My formula above is exacly equivalent, differing only in indexing. Also it makes the limits of applicability clearer. Put the SG formula in, with citation, if you prefer. Petergans (talk) 15:37, 14 May 2013 (UTC)

"Indexing" is of course quite relevant to the mathematical notion of convolution, as well as the present context of the article. I've already given my opinion of the suitability of the SG reference, but you didn't seem to be interested in discussing it then. In fact, you specifically withdrew from further discussion. So your enjoinder now to participate in "rational discussion" seems disingenuous. WP:STICK seems to be getting increasingly relevant. Sławomir Biały (talk) 16:57, 14 May 2013 (UTC)
It is not the convolution. It would only be the convolution for symmetric vector C. The problem here is that you are suggesting that a very niche application of convolution should be given far more weight in the article than other applications, so much so in fact that you are insisting that the very definition must be made to conform to this very uncommon one that doesn't happen to agree with anyone else's. See WP:WEIGHT for why this is not consistent with our policies. Sławomir Biały (talk) 12:47, 15 May 2013 (UTC)

Once again my contribution has been reverted. This is appalling.

  1. The process is convolution. This clearly stated in the introduction of Savitzky and Golay's paper
  2. The vector C need not be symmetric. Unsymmetrical examples can be seen at Savitzky–Golay smoothing filter, Table of convolution coefficients for 1st derivative
  3. The term "niche" is a point of view, not supported by the fact that "Savitzky and Golay's paper is one of the most widely cited papers in the journal Analytical Chemistry"
  4. The formula is properly sourced in the scientici literature

Petergans (talk) 14:39, 15 May 2013 (UTC)

Peter, see WP:BRD. You were bold, but then were reverted. Now you must discuss and persuade others. It is not appropriate for you to continue inserting contentious material against consensus. Sławomir Biały (talk) 15:32, 15 May 2013 (UTC)
In order to cool down the debate, I propose the following free-access reference for study to the debaters
http://www-inst.eecs.berkeley.edu/~ee123/fa12/docs/SGFilter.pdf
I don't have time to elaborate on it right now, but I will be back. Just two remarks: the interest of the SG filter has little to do with the fact that one can, or cannot, write it FORMALLY EXACTLY as f*g. Second, the SG paper is somewhat known in the strict math. literature: around 6 occurrences in references of papers listed in Math. Reviews.
Please have a look to the above link, I think it is informative, reasonable and looks mathematically correct. Bdmy (talk) 07:24, 16 May 2013 (UTC) (Bdmy (talk) 07:33, 16 May 2013 (UTC))
Thank you for this reference. It certainly confirms that SG-filtering is a convolution process. Most of the material in the pdf is covered, with detailed examples, in my book, "Data Fitting in the Chemical Sciences", though I only illustrated frequency response with a single example. I also created the article Numerical smoothing and differentiation, which has lots of detail. The use of a SG-filter for numerical differentiation is illustrated in another article that I created, Spectroscopic line shape.
The only question that remains is how best to illustrate it the mathematically. My last attempt used the formula published by SG because this can be directly applied to practical examples, whereas the formula given in the pdf is more abstract and does not give the limits of applicability, which are important in practical applications. If you agree with this, please restore my last edit. Petergans (talk) 09:28, 16 May 2013 (UTC)
Coming along to this as far as I can see what you are trying to stick in is in the plus form normally used for a cross-correlation rather than with a minus for a convolution. They do the same sorts of things but confusing the two in an article like this is not I feel in the least helpful. The paper by Savitzky and Golay seems to be an original practical paper for chemists where they implemented an algorithm in FORTRAN and what they did was convolution okay. The basic idea is a good one, however the way they wrote it down does not correspond with usage nowadays in mathematics computer science or electronics. The only real question I can see is whether the way round they put things has enough weight to be mentioned here. Personally I don't see that it does except perhaps as a warning note about the confusion in some areas. Dmcq (talk) 10:01, 16 May 2013 (UTC)
It is certainly relevant to point out that the article presently already contains precisely the formula of the SG filter as stated on the second page of Bdmy's reference. The only issue is the applicable bounds, which depend on the specific application and so are not appropriate for a general purpose article on convolution (but would be appropriate for the main article on the SG filter). The present formula in the article was already arrived at as a compromise with User:Petergans, who had added a formula together with a reference that actually didn't support the formula as written. This was then corrected by Arthur Rubin, who made the unfortunate edit summary "statement, as written, is wrong, and I don't want to repair it." Peter Gans latched onto this summary, saying "If it's wrong, correct it. And, don't remove citations". So I came along and corrected it, from the very source that Peter had used to support the statement in question. This compromise, however, did not seem to satisfy Peter, since the formula in our article no longer agreed exactly with the way convolution is used in numerical smoothing and differentiation (see discussion above why of course bounds are relevant in ad hoc applications, but that doesn't mean that the mathematical definition should dwell on things like this). So Peter added at various points to the section this rather questionable formula. Now he adds the SG filter, which is the cross-correlation rather than the convolution, and really doesn't fit at all with the article. Even if, as Dmcq says, it is possible to coax this thing into a convolution, it really doesn't belong in a general article on convolution. In fact, the article that Bdmy links clearly conveys that this application of convolution is just that: an application of convolution (one that is not even all that well known to experts in the subject). Specializing a general article on one particular application of something is not appropriate, for reasons that I have already said. Please, if you want the article to cover this application, then write a paragraph in the applications sections that links to the respective articles on numerical smoothing and the SG filter. To me, that would be an acceptable way to deal with this disagreement. Sławomir Biały (talk) 12:11, 16 May 2013 (UTC)
I was editing at the same time as User:Sławomir Biały, what follows does not take his above contribution into account.
I continue with two remarks and a proposition:
  • When the reference above, that I Googled at random, mentions "discrete convolution", it writes a formula identical to that of User:Sławomir Biały.
  • Why insist on putting a convolution sign " * " in the equation
(f*g)_n=\sum_{m=-(M-1)/2}^{(M-1)/2} f_{n+m}g_{m} \qquad \frac {M+1}{2} \le n \le N-\frac {M-1}{2}
contrary to the ubiquitous mathematical usage, and where the coefficients g(j) are unknown, so, they do not give (cannot give/should not give, in this general mathematical article) a valuable account of what S-G is really about?
  • I propose to elaborate on the following scheme:
The Savitzky-Golay algorithm[1] is important[2] for smooting data. It is based on discrete convolution and polynomial approximation. It is mainly used in the fields of...
  1. ^ S-G...
  2. ^ According to... it is the xth most viewed... also appears in mathematical articles...
with NO (in my opinion) useless and (obviously) controversial displayed equation.
Bdmy (talk) 12:31, 16 May 2013 (UTC) Bdmy (talk) 13:31, 16 May 2013 (UTC)


Discrete convolution(2)[edit]

I propose the following insertion. It answers all the concerns expressed above and is verifiable.

For vectors y of length N and C of length M (odd), the following summation is used, for example, in smoothing and differentiation of digital data:[1]

(C*y)_n=Y_n=\sum_{m=-(M-1)/2}^{(M-1)/2} C_{m}y_{n+m} \qquad \frac {M+1}{2} \le n \le  N-\frac {M-1}{2}

For illustration, with M = 5,N = 7 the formula gives (using matrix notation)


\begin{pmatrix} {Y_3 }  \\ {Y_4 }  \\  {Y_5 }  \\\end{pmatrix}=
\begin{pmatrix}
   C_{-2} & C_{-1} & C_{0} & C_{1} & C_{2} & &  \\ 
   & C_{-2} & C_{-1} & C_{0} & C_{1} & C_{2}  \\ 
   & & C_{-2} & C_{-1} & C_{0} & C_{1} & C_{2}  \\ 
 \end{pmatrix}
 \begin{pmatrix}{y_1 }  \\ {y_2 }  \\ {y_3 }  \\ {y_4 }  \\ {y_5 }  \\{y_6 }  \\ {y_7 }  \\\end{pmatrix}.
  1. ^ Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures". Analytical Chemistry 36 (8): 1627–1639. doi:10.1021/ac60214a047. 

Petergans (talk) 07:19, 17 May 2013 (UTC)

No, this insertion does not answer the concerns. It uses without a warning the same convolution sign " * " used everywhere else in this article with a different meaning. Also, you did not answer my question: WHY INSIST SO MUCH AND SO LONG to have this very notation here, conflicting with the rest of the article? This concern has been also expressed by User:Dmcq.
Bdmy (talk) 07:46, 17 May 2013 (UTC) Bdmy (talk) 08:48, 17 May 2013 (UTC)
And for a discrete convolution in the usual form elsewhere you'd need to swap Ci with Ci and those index bounds are silly because it is perfectly okay for M to be an even number. There is nothing stopping people smoothing by taking the average of two adjacent points giving a point which appears half way between them for instance. You're just talking about implementation details in a persons computer program, and they were just trying to do something practical rather than interested in convolution as such. Dmcq (talk) 08:35, 17 May 2013 (UTC)
One can actually see the implementation creeping out in that \operatorname{min}(n+m) = \frac{M+1}{2}+\left(-\frac{M-1}{2}\right) = 1, the default lower bound in FORTRAN. Dmcq (talk) 09:38, 17 May 2013 (UTC)
This is truly a convolution. Don't just take my word for it. It's in the SG paper. The purpose of the illustration is to show how the convoluting function C "slides" through the data vector y, just as it should do in a convolution process. The indexing may seem peculiar, but it's convenient and is the indexing used in the source, the SG paper, so its verifiable. It is indeed perfectly OK for M to be an even number. The reason for preferring M odd is that then Yn can replace yn, which is convenient in practice (otherwise the Y points would be between the y points on the abscissa) . Your point about \operatorname{min}(n+m) = \frac{M+1}{2}+\left(-\frac{M-1}{2}\right) = 1 is also a good one. In an earlier version I changed the summation indexing to m= 1,..,M, but that was reverted.
This is a Wikipedia issue, not a maths issue. I insist in order to make links from other Wikipedia articles more meaningful. Have you seen how many articles link to this one?
Let's have no more negative criticism. Either come up with something that is more acceptable, or accept the present proposal. Petergans (talk) 10:19, 17 May 2013 (UTC)
Your conclusion is extraordinary: nobody agrees with you, so we should accept this insertion. I am ready to believe that no "rationale debate" is possible with you.
Bdmy (talk) 10:48, 17 May 2013 (UTC)
You should document in the smoothing and differentiation article that the mathematical convention for indices when talking about convolution is the opposite of the one used by Abraham Savitzky and Marcel J. E. Golay.. The convention there has too little WP:WEIGHT in this context. You really should not be pushing some fifty year old implementation in FORTRAN in a concept article. Dmcq (talk) 11:17, 17 May 2013 (UTC)
I believe that I've already indicated what would be acceptable: a paragraph in the Applications section that links to the articles on the SG filter and numerical smoothing and differentiation. Implementation details should be left out of this paragraph, and instead contained in the main articles where it is relevant. Sławomir Biały (talk) 11:20, 17 May 2013 (UTC)

Have it your way. The section is incomplete because it lacks a formula for the convolution of two vectors. Petergans (talk) 13:10, 17 May 2013 (UTC)

You do realize there are a lot of other smoothing filters besides the Savitzky–Golay one?Dmcq (talk) 14:14, 17 May 2013 (UTC)
I am curious to see a relevant mathematical source that defines convolution of VECTORS in say, Rn. I believe more in a source like Tichmarsch than Savitzky-Golay, when it is about writing a general article about the mathematical notion of convolution. I am really amazed that Petergans seems unable to get this point. For me, if the section is incomplete, it is because it does not mention Zn for example and other discrete groups. I do not see what benefit for this article in adjoining to this section material that does not add any mathematical content to the formula already given in the case where one sequence is finitely supported. Bdmy (talk) 14:39, 17 May 2013 (UTC)
A last try before I give up. Peter, is it possible that you do not see that you are trying to put an incredible emphasis on a formula which is worth nothing, not just in mathematical terms, if it is not supported by the much more important part of the S-G algorithm that is to follow, and that does not fit here? If Wikipedia in general is concerned, what benefit can a general reader get from knowing that you define, as you do, new coefficients (C * y)_n, where the C_j are unknown coefficients? What difference with just having the usual formula for convolution, where he/she has just to change the unknown coefficients C_j into the just as unknown coefficients C_{-j}? Will you go on for ever writing:
but it is written in S-G!!! but it is written in S-G!!! ...
Bdmy (talk) 17:33, 17 May 2013 (UTC)
Maybe convolution with vectors is not discussed in some mathematical texts. It is, nevertheless, frequently used in that context in the scientific literature. It's not that the SG formula is important in itself - originally I only cited it as an example. Actually, I believe that smoothing is largely a cosmetric operation; least-squares procedures are generally better for dealing with noisy data. For me the real value of SG filters is that they provide a simple method for doing numerical differentiation, so open a convenient pathway into derivative spectroscopy. Other areas of application of numerical derivatives of experimental data, extensively used, are automated peak finding (zeros in odd derivatives) and end-point determination (location of inflection points), mentioned in numerical smoothing and differentiation#applications. The theory behind these application should be outlined in a maths article and on that basis the section on discrete convolution is incomplete. Petergans (talk) 19:39, 17 May 2013 (UTC)
As I asked before, what is, in the literature, the difference between convolution of vectors and convolution of function on Z with finite support? If there is no essential difference, then the existing "Discrete convolution" section needs no further comment. — Arthur Rubin (talk) 20:26, 17 May 2013 (UTC)
One just has to choose whether to embed the vector into the group algebra of Z or some Zn. I agree that the section needs no further comment. Mct mht (talk) 21:04, 17 May 2013 (UTC)
I would dispute that the least squares algorithm with polynomials is the best way of smoothing. S-G have been quite careful to choose a degree and width so the coefficients go down the further away from the central point but that does not happen in general. Basically the problem is that it doesn't weight the centre more than the ends and polynomials are nasty at the ends. I would consider for instance a straightforward low pass filter far better in general. Notice the way it wiggles getting smaller and smaller at the ends. You can do differentiation with any of the methods without any bother. I am not totally impressed by you quoting articles you largely wrote. Dmcq (talk) 22:58, 17 May 2013 (UTC)

Domain of definition[edit]

This section does not cover discrete convolution. Petergans (talk) 07:35, 18 May 2013 (UTC)

I see Mark Viking has put in a link to a description of the method you are trying to push here. I've no problems with what they've done. Do you disagree with that paper? In particular equation (4)? Dmcq (talk) 10:05, 18 May 2013 (UTC)
There's a bit less to say about the domain of definition of the discrete convolution in Z. The typical application is for absolutely summable sequences, although it can be defined in other \ell^p spaces. I've added some discussion of this to the section on the discrete convolution. Sławomir Biały (talk) 11:53, 18 May 2013 (UTC)
It looks a bit funny because practically the same condition is in there twice, under the discrete convolution for the sequences and once under Domain of definition for the integrable functions. The same thing happens with Young's inequality in there twice for measures. Dmcq (talk) 12:22, 18 May 2013 (UTC)
Yes, and it probably should appear yet again in the section on groups. In some sense, this is in increasing level of mathematical sophistication. It could probably be made more harmonious though. I've removed it for now. It's probably best not to dwell on these things. Sławomir Biały (talk) 13:06, 18 May 2013 (UTC)

Applications[edit]

The sentence "They are valued for smoothing spectroscopic curves wile preserving the shape and phase of the peaks" is rubbish. See Numerical smoothing and differentiation#Signal distortion and noise reduction for details. Petergans (talk) 08:58, 25 May 2013 (UTC)

That is cited to a paper published in IEEE Signal Processing magazine and written by a person with some fairly impressive credentials in signal processing. Perhaps you could point to a better source and explain what it says different thanks. Dmcq (talk) 09:38, 25 May 2013 (UTC)
It's rubbish because spectroscopic curves don't have phase. The statement applies only to those electrical signals that do have phase. It lacks generality and has been quoted out of context. Petergans (talk) 18:53, 25 May 2013 (UTC)
You're right; spectroscopic curves don't have phase. Only in case of laser (or equivalent) illumination or in active RF "spectroscopy" are there phases. I'll have to read the paper to determine the signal processing context. — Arthur Rubin (talk) 19:16, 25 May 2013 (UTC)
Thanks for your comments. But you are confusing how spectrograms are produced with how they are filtered. Power spectrograms are the modulus squared of, for instance, the short-time Fourier transform of a signal and so do not retain the phase of the original signal. But the spectrogram is also a real valued integrable function and as such can be decomposed into Fourier components, with each component having its own phase. An S-G filter is an FIR filter, and like any filter, can potentially alter the phases of those components. Symmetric S-G filters, however, have zero phase and so do not alter the phases of those components. I apologize that my summary in the article was a bit too terse and thus confusing, but in filtering a spectrogram function, the concept of phase definitely exists. If you are looking for the source of "while preserving the shape and height of the peaks", that is on page 111 of the cited reference, top of the second column. The assertion "Symmetric S-G filters have zero phase" is on page 116, first column. --Mark viking (talk) 20:07, 25 May 2013 (UTC)
I just had a another look and it looks like I didn't check the words here against there properly. I just grabbed words from here and then looked for something that corresponded there and found "Savitzky and Golay were interested in smoothing noisy data obtained from chemical spectrum analyzers, and they demonstrated that least squares smoothing reduces noise while maintaining the shape and height of waveform peaks (in their case, Gaussian shaped spectral peaks)" which seemed fine to me. I think maybe I read preserving the phase of the peaks as simply meaning they weren't shifted but it would be simpler to just say that. Dmcq (talk) 21:14, 25 May 2013 (UTC)
Unfortunately, "maintaining the shape and height" is not, strictly speaking, correct. It is an approximation that is useful iff the convoluting function is chosen carefully. The SG filter has the following conservation properties. i) peak positions ii) peak symmetry iii} peak area. There are proofs in Appendix 7 of my book. Conservation of area (integral of convoluted function) is what is relevant here. If a SG filter is an integrable function then convolution#integration also proves it. Smoothing of a Gaussian, for example, reduces the height and increases the half-width. "Maintaining shape" is ambiguous, it could be taken to mean either height and half-width, or shape function. I don't know about conservation of shape function. The literature on the SG filter is extensive , so maybe it has been proved. If so, a citation would be needed. What is certain is that the height and half-width of the original shape function are not conserved. Petergans (talk) 08:06, 26 May 2013 (UTC)
I took it as meaning a tendency rather than an absolute. The height of a peak tends to be maintained for instance compared to a blurring filter like an average or compared to an edge detection filter. I'm not sure what you mean by conserving peak area over and above that the coefficients add to 1 or peak position given that some peaks will disappear - I guess you have some special definition of those. Dmcq (talk) 10:14, 26 May 2013 (UTC)
On further thought, SG-smoothing does not conserve shape function. In the FT co-domain the SG-smoothing filter is a sum of cosines, so whatever the FT of the shape function its form will not be conserved when mutiplied by the SG-FT. Let's have some clear and unambiguous wording, please. Petergans (talk) 10:48, 26 May 2013 (UTC)
If the points follow a low degree polynomial the shape will be conserved. Dmcq (talk) 11:35, 26 May 2013 (UTC)

Applied mathematics[edit]

I have now finished editing Savitzky–Golay filter for smoothing and differentiation. Please have a look at it.

The theory in this article (Convolution) is all about "pure" mathematics and has inadequate background on an important application such as the S–G filter. For example, on what basis can convolution be used to obtain derivatives of a function? I ask that you guys look again at the "applied" mathematics aspects of convolution. The S-G filter deserves to have a proper background treatment here. Petergans (talk) 08:27, 1 August 2013 (UTC)

Differentiation: boundary term?[edit]

I think that the formula for differentiation is wrong and misses the boundary term:

\frac{d}{dx}(f * g)(x) = (f'  * g )(x) + f(0) g(x) = (f * g')(x) + f(x) g(0)

I'm aware that the term missing in MathWorld as well. But just quoting from somewhere is no proof.

When taking the derivative, there is an "x" at the integral boundaries and at the integrands, which makes two terms. One may also compute the derivative explicitly from infinitesimal differences. Third way is to go via a Laplace transform, where the boundary terms show up explicitly:

\mathcal{L}_s[h'] = \int_0^\infty e^{-s x} h'(x) dx  = -h(0) - s \hat h(s)

\mathcal{L}_s[(f * g)'] = -(f * g)(0) - s\widehat{f * g}(s) 
= - s\hat f(s) \hat g(s)
= \mathcal{L}_s[f'] \hat g(s) + f(0) g(s) 
= \mathcal{L}_s[(f'*g)(x) + f(0) g(x)]
since  (f * g)(0) = 0 . — Preceding unsigned comment added by Felix0411 (talkcontribs) 15:11, 3 December 2013 (UTC)

There is no boundary term. The reason you're getting one is that you aren't taking the convolution of f and g. You've smuggled a heaviside function into the mix (notice that the lower limit of integration is zero, not negative infinity). Sławomir Biały (talk) 15:52, 3 December 2013 (UTC)
Upps, I didn't notice that. My version comes up naturally if f,g are response functions respecting causality. As far as I know the one-sided definition is called a "convolution" as well, sometimes with the attribute "one-sided". Should the article warn the reader about the different definitions?Felix0411 (talk) 16:42, 4 December 2013 (UTC)
It seems like a good idea to mention the one-sided convolution as a variation of the standard (two-sided) one. I wouldn't phrase it as a warning to the reader though. Sławomir Biały (talk) 01:23, 6 December 2013 (UTC)