# Formal derivative

In mathematics, the formal derivative is an operation on elements of a polynomial ring or a ring of formal power series that mimics the form of the derivative from calculus. Though they appear similar, the algebraic advantage of a formal derivative is that it does not rely on the notion of a limit, which is in general impossible to define for a ring. Many of the properties of the derivative are true of the formal derivative, but some, especially those that make numerical statements, are not. The primary use of formal differentiation in algebra is to test for multiple roots of a polynomial.

## Definition

The definition of formal derivative is as follows: fix a ring R (not necessarily commutative) and let A = R[x] be the ring of polynomials over R. Then the formal derivative is an operation on elements of A, where if

$f(x)\,=\,a_n x^n + \cdots + a_1 x + a_0$

then its formal derivative is

$f'(x)\,=\,Df(x) = n a_n x^{n - 1} + \cdots + 2 a_2 x + a_1$

just as for polynomials over the real or complex numbers.

Note that $n a_n$ does not mean multiplication in the ring, but rather $\sum_{k=1}^n a_n,$ where $k$ is never used inside the sum.

One should mention that there is a problem with this definition for noncommutative rings. The formula itself is correct, but there is no standard form of a polynomial. Therefore using this definition it's difficult to prove $(f(x)\cdot b)'=f'(x)\cdot b$.

## Alternative definition well suited for noncommutative rings

Let for $r\in R$ holds $r'=0,$ let $x'=1.$ Let us define derivative for expressions, such that $(a+b)'=a'+b',$ and $(a\cdot b)'=a'\cdot b+a\cdot b'.$

We must prove that this definition gives the same result for an expression independent on the method the expression was evaluated, therefore that it is compatible with the axioms of equality.

• $(a+b)'=a'+b'=b'+a'=(b+a)',$
• $((a+b)+c)'=(a+b)'+c'=(a'+b')+c'=a'+(b'+c')=a'+(b+c)'=(a+(b+c))',$
• $(a(bc))'=a'(bc)+a(bc)'=a'bc+a(b'c+bc')=a'bc+ab'c+abc'=$
$=(a'b+ab')c+(ab)c'=(ab)'c+(ab)c'=((ab)c)',$
• $((a+b)c)'=(a+b)'c+(a+b)c'=(a'+b')c+(a+b)c'=(a'c+b'c)+(ac'+bc')=$
$=\cdots=(a'c+(b'c+ac'))+bc'=(a'c+(ac'+b'c))+bc'=\cdots=(a'c+ac')+(b'c+bc')=$
$=(ac)'+(bc)'=(ac+bc)',$

and the distributivity from the other side from symmetry.

Linearity naturally follows from the definition.

Formula for derivative of a polynomial (in standard shape for commutative rings) is direct consequence of the definition: $(\sum_i a_ix^i)'=\sum_i (a_ix^i)'=\sum_i ((a_i)'x^i+a_i(x^i)')=\sum_i(0x^i+a_i(\sum_{j=1}^ix^{j-1}(x')x^{i-j}))=\sum_i\sum_{j=1}^i a_ix^{i-1}.$

## Properties

It can be verified that:

• Formal differentiation is linear: for any two polynomials f(x), g(x) and elements r, s of R, we have
$(r \cdot f + s \cdot g)'(x) = r \cdot f'(x) + s \cdot g'(x).$
When R is not commutative there is another, different linearity property in which r and s appear on the right rather than on the left. When R does not contain an identity element then neither of these reduces to the case of simply a sum of polynomials or the sum of a polynomial with a multiple of another polynomial, which must also be included as a "linearity" property.
• The formal derivative satisfies the Leibniz rule, or product rule:
$(f \cdot g)'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x).$
Note the order of the factors; when R is not commutative this is important.

These two properties make D a derivation on A (see also module of relative differential forms for a discussion of a generalization).

## Application to finding repeated factors

As in calculus, the derivative detects multiple roots: if R is a field then R[x] is a Euclidean domain, and in this situation we can define multiplicity of roots; namely, for every polynomial f(x) and every element r of R, there exists a nonnegative integer mr and a polynomial g(x) such that

$f(x) = (x - r)^{m_r} g(x)$

where g(r) is not equal to 0. mr is the multiplicity of r as a root of f. It follows from the Leibniz rule that in this situation, mr is also the number of differentiations that must be performed on f(x) before r is not a root of the resulting polynomial. The utility of this observation is that although in general not every polynomial of degree n in R[x] has n roots counting multiplicity (this is the maximum, by the above theorem), we may pass to field extensions in which this is true (namely, algebraic closures). Once we do, we may uncover a multiple root that was not a root at all simply over R. For example, if R is the field with three elements, the polynomial

$f(x)\,=\,x^6 + 1$

has no roots in R; however, its formal derivative is zero since 3 = 0 in R and in any extension of R, so when we pass to the algebraic closure it has a multiple root that could not have been detected by factorization in R itself. Thus, formal differentiation allows an effective notion of multiplicity. This is important in Galois theory, where the distinction is made between separable field extensions (defined by polynomials with no multiple roots) and inseparable ones.

## Correspondence to analytic derivative

When the ring R of scalars is commutative, there is an alternative and equivalent definition of the formal derivative, which resembles the one seen in differential calculus. The element Y-X of the ring R[X,Y] divides Yn - Xn for any nonnegative integer n, and therefore divides f(Y) - f(X) for any polynomial f in one indeterminate. If we denote the quotient (in R[X,Y]) by g:

$g(X,Y) = \frac{f(Y) - f(X)}{Y - X}$

then it is not hard to verify that g(X,X) (in R[X]) coincides with the formal derivative of f as it was defined above.

This formulation of the derivative works equally well for a formal power series, assuming only that the ring of scalars is commutative.

Actually, if the division in this definition is carried out in the class of functions of $Y$ continuous at $X$, it will recapture the classical definition of the derivative. If it is carried out in the class of functions continuous in both $X$ and $Y$, we get uniform differentiability, and our function $f$ will be continuously differentiable. Likewise, by choosing different classes of functions (say, the Lipschitz class), we get different flavors of differentiability. This way differentiation becomes a part of algebra of functions.