Jump to content

User:Hypercomplex Automatic Differentiation/sandbox

From Wikipedia, the free encyclopedia

The Complex Taylor Series Expansion Method (CTSE)

[edit]

The starting point to learning HYPAD (HYPercomplex Automatic Differentiation) is the Complex Taylor Series Expansion method (CTSE), also called “complex step”. This method is convenient because it uses only complex variables, complex algebra, and complex functions contained in any modern computer language. Therefore, one can immediately compute numerical derivatives using whatever language is preferred.

CTSE is an ingenious numerical method that can be used to compute highly accurate numerical derivatives. The essence of CTSE can be seen in the simple one dimensional Taylor series expansion of a complex-valued function

Eq. 1

where denotes the first derivative of with respect to , the second derivative, and the 'th derivative. Evaluating the series with the complex number , and the real number , then , where is the step size. As a result,

Eq. 2

We see that the terms within the complex Taylor series alternate between real and imaginary. We can sort this equation into real terms and imaginary terms as

Eq. 3

Eq. 4

where denotes “Higher Order Terms”. Equation 3 shows that the imaginary terms divided by can be written as

Eq. 5

As a result, the first imaginary term contains the first derivative times , and the first derivative can be approximated using the operation

Eq. 6

Equation 6 is not exact; however, as shown below, the terms of can be made to be numerically insignificant.

The key to CTSE is that it does not involve subtractive error, that is, it does not involve the subtraction of terms. As a result, the step size can be made very small. A small step size is valuable as it reduces truncation error by driving the higher order terms, represented by , below machine precision. The end result is that, although #Eq. 3 is not exact, it can be used to compute highly accurate derivatives.

A small is advantageous but what constitutes a sufficiently small ? A numerical study shown below suggests that one should use a value of or smaller. However the size of the parameter being perturbed should also be considered. In other words, for example, if the units of the parameter of interest is , CTSE will not provide accurate results using a value of . Hence as a practical solution, should be at most times the parameter of interest, but it can be smaller. Since the smallest number that can be represented with double precision arithmetic is , it follows that the perturbation applied must be greater than this value to prevent underflow. More practically, the variable of interest and its perturbation must be greater than . To summarize, there is a huge swath of numbers that can be used for the step size to obtain machine precision accuracy. A practical solution should be at least times the parameter of interest. The suggested limits for are

An additional aspect to CTSE is its effects on the real portion of . As shown in equation 4, the real results get “polluted” by terms of order . However, as stated previously, these terms are made numerically insignificant through the use of a small . As a result, we numerically obtain the same real-valued result for as for .

In summary, if we evaluate using a sufficiently small step size instead of only , we obtain real result and .

1 Analogy with the finite difference method

[edit]

CTSE has a direct analogy with the well-known finite difference (FD) method for estimating a derivative using a numerical approximation. Figure 1 shows a graphical representation comparing FD and CTSE. In this figure, the first order derivative of a function with respect to parameter is determined numerically by both approaches, FD by perturbing a parameter of interest along the real axis () by a step size , and CTSE by perturbing along the imaginary axis () by a step size . An estimate for the derivative using a forward finite difference approach is

Eq. 7

In an analogous manner, CTSE perturbs the parameter of interest by a step size but along the imaginary axis. That is, real-variable parameter of interest is made complex-variable and its imaginary component is given a value of . The derivative is then estimated as shown in equation 6

Eq. 8

Both approaches are numerical estimates, however, as will be shown, CTSE provides far superior results.

As we shall see, , that is, the step size used to perturb the function for CTSE () is many orders of magnitude smaller than the step size used in the finite difference method.

There is one major algorithmic difference between FD and CTSE. Comparing equations 7 and 8, notice that the estimate of the derivative using FD is the difference between two results, whereas CTSE has no differencing operation. As a result, FD has subtractive error if is too small. Conversely, truncation error results if is too large. Consequently, there is a sweet spot for the step size, i.e. cannot be too large nor can it be too small - the bounds on the step size are two sided. In addition, a satisfactory step size is parameter and problem dependent - what works in one situation will not necessarily work in another. Finally, there usually is no way to know when a good step size has been reached.

CTSE also suffers from truncation error but not subtractive error - this lack of subtractive error is the significant difference between the two methods and the reason why CTSE is numerically superior. Hence, the bounds on the step size for CTSE are one sided - can be too large but it cannot be too small (within the representation of a real number by the computer). As a result, the step size can easily be made sufficiently small such that truncation error is eliminated without suffering from subtractive error. In summary, both FD and CTSE suffer from truncation error - a large step size leads to errors; however, only FD suffers from subtractive error - a small step size leads to errors.

The one drawback to CTSE is that the function must be complex valued and the intrinsic real variables within the computer model must become defined as complex variable. Hence CTSE is an intrusive method - the source code must be changed; however, the change is simple in concept. Often, this leads to a conceptual challenge - what is the physical meaning of an imaginary displacement, force, stress, strain, or pressure, as examples? The answer is that the imaginary component is merely a mathematical convenience for computing the derivative and of no physical significance.

Figure 1: FD & CTSE perturbation

As an aside, you may be wondering if a negative step is useful in CTSE? A negative step in finite differencing results in the “backwards difference” method, which sometimes has advantages. That is, when the step sizes are of significant size, as in the finite difference method, the a backwards step will sometimes yield an improved estimate over the forward step. However, in the case of CTSE, although a negative step is certainly possible, there no value in adopting a negative step since the step size is so small, that is, forwards and backwards step results yield no difference. Hence, a forward step will be used for all examples. However, the mathematical equations for a backwards step are listed below.

The Taylor series for a negative step is

Eq. 9

Therefore, the derivative can be determined from the equation

Eq. 10

Although the real power of CTSE lies in its application to sophisticated numerical models such as large finite element analyses, it is instructive to observe how it operates on several common functions. Therefore, analytical examples are provided followed by numerical examples. In all cases, the numerical result is within machine precision. Of course, since all computer languages contain complex numbers, the user can easily verify CTSE on these functions and many others.

2 Closed-form examples of elementary functions

[edit]

It is instructive to apply CTSE analytically to elementary functions to assess their behavior as . In these examples, it is straightforward to determine the behavior as ; however, many computer languages have a “Limit” operator such that the limit as can be implemented in software. With each example is a figure that is plot of the real and imaginary portions of the function. It is also shown that CTSE automatically accounts for the product and chain rules from calculus. These rules do not have to be explicitly implemented.

Example 2.1:

[edit]

Consider the simple function . Then

We see in that for this example CTSE is exact as there are no higher order terms with respect to .

Figure 2: Graphical derivative of and

Example 2.2:

[edit]

Consider the function . Then

Here the derivative is exact to within . However, the exact derivative is recovered as . In effect,

Figure 3: Graphical derivative of and

Example 2.3:

[edit]

Consider . Then

Eq. 11

As shown in Eq. 11, CTSE is not exact but as , and the numerical derivative is recovered as approaches zero. In effect,

Figure 4: Graphical derivative of and

Example 2.4:

[edit]

Consider . Using the identity for the sum of two angles

we can express the sine of a complex number as

using , and

Then

As , and the numerical derivative is recovered as approaches zero. In effect,

Figure 5: Graphical derivative of and

Example 2.5:

[edit]

Consider . Using the identity for the sum of two angles

we can express the cosine of a complex number as

Then

As . In effect,

Figure 6: Graphical derivative of and

Example 2.6:

[edit]

Consider . Using the identity for the sum of two angles

we can express the tangent of the complex number as

Multiplying the numerator and the denominator by the conjugate of the denominator gives:

Then

As , . In effect,

Figure 7: Graphical derivative of and

Example 2.7:

[edit]

Consider . Then using the complex variable version of the hyperbolic sine

Then,

As , and the numerical derivative is recovered as approaches zero. In effect,

Figure 8: Plot of and

Example 2.8:

[edit]

Consider . Then using the complex variable version of the hyperbolic cosine

Taking

As , and the numerical derivative is recovered as approaches zero. In effect,

Figure 9: Plot of and

Example 2.9:

[edit]

Consider the function with . Expressing the complex number in its polar form.

Taking the imaginary part and dividing by

As

This form is an indeterminate form of the type . Applying L'Hôpital's rule we obtain,

Figure 10: Plot of and

Example 2.10

[edit]

Consider the function with . Expressing in its polar form,

Using the fact that and for small that ,

As ,

Figure 11: Plot of and

Example 2.11

[edit]

Consider the function . Replacing by , (notice that we need to replace all occurrences of with )

As , , and , hence,

and the exact derivative is obtained.

Figure 12: Plot of and

Example 2.12

[edit]

Consider the function . Replacing by ,

Multiplying the numerator and denominator by yields

Extracting the imaginary part and dividing by yields

In the limit as , , , and . As a result,

Figure 13: Plot of and its derivative

Example 2.13

[edit]

Consider the function . Replacing by ,

Extracting the imaginary part and dividing by yields

In the limit as , , and .

Note, the first few terms of the Taylor series for

,

hence,

.

As a result,

Figure 14: Plot of and its derivative

3 Cauchy-Riemann (CR) form

[edit]

Although it is almost always possible to use complex algebra within any given situation, it is valuable to understand the Cauchy-Riemann form of complex numbers and how to use this form for computing derivatives. One example would be where one needs to solve a complex system of equations but only has access to a real-variable solver. In this case, one can expand the complex system into a larger but real-only system of equations, then apply the real-variable solver.

Any complex number can be written as a matrix of all real numbers using the Cauchy-Riemann matrix form, i.e.

Eq. 12

In the CR matrix, the real part is located along the diagonal and the imaginary part is located on the off diagonals. Note, that the matrix is non-symmetric.

Thus, to differentiate a function using CTSE, one can compute , where is the small step size. After evaluation of , the result will be of the form . Hence, the derivative will be located in the 2nd row, 1st column, times the step size . Thus, the equivalent form of the equation is to extract the value from the 2nd row, 1st column, and divide by . Several examples that demonstrate how to compute the derivative using the CR matrix form of a complex number are provided below. Note, however, that functions of matrices are often required. For example, must use the matrix sine function.

Example 3.1:

[edit]

Consider the simple function .

Hence, .

Example 3.2:

[edit]

Consider the simple function .

Hence, . The exact derivative is recovered as .

Example 3.3:

[edit]

Consider the simple function . Here we use to indicate the matrix exponential function. The derivative is determined as, . The exact derivative is recovered as .

4 Numerical step size study

[edit]

The comparison of the effect of the step size on the accuracy of the derivative can be shown through the following example. Consider the single variable function shown below and its analytical derivative evaluated at , taken from Reference [ref Fike]. The exact values given up to 15 significant figures are and

Eq. 13

The derivative of this function is computed using the forward difference (FD) formula,, the central differencing (CD) formula, , and CTSE, each as a function of step size. The results are plotted in below with all computations computed as double precision with 15 significant digits. The relative error is defined as , where is the derivative computed by FD, CD, or CTSE. As shown, the accuracy of the forward differencing method increases as the step size is initially reduced due to a reduction in truncation error; however, for step sizes below , the error increases due to subtraction error dominating the truncation error. Similarly, the central differencing method shows improved accuracy until a step size of has been reached, beyond which the error increases. As a result, one can see that neither the forward nor central differencing methods can achieve machine precision for any step size. In addition, this “best” step size is problem and parameter dependent, that is, there is no straightforward way to know a priori the appropriate step size, (although an optimum step size has been proposed that requires an approximation to the second order derivative [Haftka]). These methods behave as they do due to the conflict between truncation error (smaller step size is better), and subtractive error (larger step size is better). CTSE only suffers from truncation error, hence, the smaller the step size the better the accuracy, with no limit until machine precision is

Figure 15: Accuracy of forward difference (blue), central difference (purple), and CTSE (red) as a function of step size for the function evaluated at

reached. In fact, a step size of the relative error, defined as

, is less than , which is machine precision. (Note, the relative error was computed as zero for some steps sizes. As zero cannot be plotted in a logarithmic plot, a value of 1E-16 was used.) In section 1, we discussed why the step size must be greater than . In fact, one can continue the step size study all the way until and verify that the relative error will be approximately . Note, there is nothing exceptional about the function chosen for the step size study. Almost identical results will be obtained for any other function.

5 Numerical evaluation of special functions

[edit]

It is quite convenient to use CTSE to compute the derivative for any number of numerical functions such as the Error and complementary Error functions (Erf, Erfc), the Gamma function, the Beta function, the Mathieu function, inverse trigonometric functions, elliptic integrals, Airy functions and many others, as these functions have complex variable versions.

One can see CTSE in action by plotting , and as a function of for several special functions as shown below. As expected, the function and its derivative are contained in the plot. Several numerical examples are provided.

Figure 16: Plot of and
Figure 17: Plot of and dErf(x)/dx
Figure 18: Plot of the complementary error function and its derivative
Figure 19: Plot of Gamma function and its derivative
Figure 20: Plot of the Bessel function of the first kind and its derivative
Figure 21: Plot of the elliptic integral of the first kind and its derivative
Figure 22: Plot of Beta function and its derivative
Figure 23: Plot of the Airy function and its derivative

6 Finite element example using a beam element

[edit]

A simple finite element example using an Euler-Bernoulli beam element is shown in Figure 23. It is straightforward to apply CTSE to compute derivatives with respect the model parameters and obtain exact results as .

The beam is under load and has displacement, , and rotation, , degrees of freedom at each node. The stiffness matrix for a beam element of elastic modulus , cross sectional moment of inertia , and length , with displacement, , and rotation, , degrees of freedom at each node is

After applying the constraints at the right end , the system of equations becomes

The solution to this system of equations can be obtained as

yielding

6.1 Sensitivity with respect to loading P

[edit]

Consider the computation of the derivative of the displacement and rotation with respect to the applied load . The exact solutions are

Based on the CTSE method, we perturb the load by and solve the system of equations

The solution is

The derivatives of the displacement and rotation with respect to the applied load are obtained exactly as

6.2 Sensitivity with respect to elastic modulus E

[edit]

Consider the computation of the derivative of the displacement and rotation with respect to the elastic modulus. The exact solutions are

Based on the CTSE method, we perturb the load elastic modulus by and solve the system of equations

The solutions are

Applying the complex conjugate to and , the derivatives with respect to the elastic modulus are

We see that as , the exact results are obtained.

6.3 Sensitivity with respect to length L

[edit]

Consider the computation of the derivative of the displacement and rotation with respect to the length of the beam. The exact solutions are

Based on the CTSE method, to compute the sensitivity with respect to the element length, we perturb the length by and solve the system of equations

The solution to this system of equations is

The derivatives of the displacement and rotation with respect to the beam length are approximated as

We see that the displacement derivative is exact up to and the derivative of the rotation is exact. We see that as , the exact results are obtained.

Figure 24: Schematic of cantilever beam under load P

7 Numerical integration

[edit]

CTSE can be applied within numerical integration algorithms to compute the derivatives with respect to parameters within the integrand and the integration limits. The following examples demonstrate this using two numerical integration methods, Simpson's rule and Gauss-Legendre quadrature; however, other numerical integration algorithms can be used in an equally straightforward way. The important point to realize is that no changes are required to the numerical algorithm. In effect, CTSE can be implemented unaltered with the exception of perturbing one of the parameters along the imaginary axis by a small step size .

7.1 Simpson's rule

[edit]

The particular Simpson's rule used for the demonstration is Simpson's 1/3 rule, defined as

Eq. 14

where is the function to be integrated.

To compute derivatives with respect to parameters , we perturb the parameter of interest, compute the integral, extract the imaginary component, then divide by the step size. To be more explicit, to compute , we define and use this value in the formula for Simpson's rule, that is

Eq. 15

and similarly for . Each perturbation of , and is conducted independently; hence, two analyses are required to compute derivatives with respect to , and . It should be clear that the integration is conducted using real variables and not integration within the complex plane.

Example 7.1.1:

[edit]

Consider the integral with the analytical solutions , . The integral and the derivatives with respect to , and were computed using numerical values and perturbing each in turn by a step size of . For example, to compute the derivative with respect to , we perform the numerical operations

Eq. 16

The results are shown in Table In this case, the evaluation of the integral and the derivatives with respect to and are accurate to within machine precision, i.e., 15 decimal places.

Notice that it is not necessary to have any special consideration when computing derivatives with respect to a parameter in the integration limits, that is, we do not need to explicitly apply Leibniz's rule for this case, that is, for example .

Example 7.1.2:

[edit]

Consider the integral where and are constants. In this case, we will compute the derivates with respect to the limits and , and a parameter contained within the integrand. The exact indefinite integral (using integration by parts) is and the exact definite integral is . The exact derivatives of this integral with respect to the constants are: , and .

The analytical derivatives are:

The computation of the derivatives and are computed as in the previous example, i.e., one can use equation 15 to compute . The derivative is conducted by applying Simpson's rule to the function using CTSE. Simpson's rule becomes for this case

The results are shown in Table 2 using the values and . These results indicate that neither the integral nor the derivatives are exact. This is an important point. Although we have shown that CTSE can compute derivatives of machine precision, this does not mean that the derivatives are always exact - it depends upon the application. In general, if the real-valued operation, Simpson's rule in this case, is not exact, neither will be the derivatives. While there is no formal proof, empirical results usually show that the derivative results are slightly less accurate than the real-valued results. For example, for this case, the relative error in the computation of the integral using Simpson's rule is , whereas the relative error in the derivatives are . Of course, improved results for the integral and the derivatives could be obtained by subdividing the integral and applying equation 14 repeatedly.

Accuracy analysis of dI/dc
[edit]

A natural question arises as to the accuracy of the derivatives. It is instructive to compute the “best” estimate of that can be obtained using Simpson's rule in order to compare with the results computed using CTSE. To determine the “best” solution for we take the derivative inside the integral and analytically differentiation the integrand before integration. In this way, no error is incurred during differentiation and all error will be due to the discretization employed within Simpson's rule, for example

Applying Simpson's rule to this integrand with yields the result , identical to the CTSE result shown in Table 2 - in fact, the two solutions agree to 15 decimal places. This example supports the claim that CTSE, applied with a sufficiently small step size, will provide the most accurate derivative possible and that all errors in the result are due to errors in the numerical method itself. In fact, the claim that CTSE will provide the “best” possible derivative is a general claim that is true regardless of the numerical algorithm. Figure 24 below shows the convergence of as a function of step size. The results converge at about a value of and are constant thereafter. This behavior is typical when applying CTSE within numerical algorithms. While CTSE does not add any error, the accuracy of the derivatives will be dependent upon the numerical methods employed.

Figure 25: Step size study of the convergence of dI/dc as a function of step size using CTSE employed within Simpson's rule

7.2 Gauss-Legendre Quadrature

[edit]

Gauss-Legendre quadrature is a popular numerical integration scheme. In this case the integral is computed using the formula

where are the weights and evaluation points, respectively, and defines the number of integration points. Gauss integration with integration points can exactly integrate polynomials of order . Consider 3 point Gauss integration with evaluation points and weights , respectively. The derivatives of the integral using Gauss integration can be obtained using CTSE by perturbing or as or . For example, can be computed as

Eq. 17

Eq. 18

and similarly for

Example 7.1.3:

[edit]

Consider the integral with the analytical solutions , . In this case, the Gauss integration with 3 integration points is exact. The integral and derivatives were computed using the values . The derivatives with respect to and computed using CTSE are also exact and agree with the values shown in Table

Example 7.1.4:

[edit]

Consider the integral , where now the integrand contains a parameter . Derivatives with respect to , and were computed as shown in equation 17. Derivatives with respect to were computed using the formula

Eq. 19

Eq. 20

8 Newton-Raphson Method

[edit]

The Newton-Raphson Method is an algorithm that can be used to approximate the root of a function. The estimate of the root of function at iteration can be estimated from the result at step as

Eq. 21

Rewriting Equation 21 with the help of the CTSE where , then and and one obtains

Eq. 22

Example 8.1:

[edit]

The goal is to find the value of such that , or in other words, find the root of the equation ; therefore, we define . The Newton-Raphson equation becomes

This function converges quickly to the numerical value x=0.739085133215161 as shown in the table and figure below starting from an initial value of .

Figure 26: Convergence of the Newton-Raphson algorithm for the function f(x)=cos(x)-x starting from a value of x=0.2. The dashed red line indicates the exact solution within 15 decimal places.

9 ODE solver

[edit]

CTSE can be applied within an ODE solver to compute derivatives of the ODE solution with respect to parameters in the differential equation. The beauty of CTSE is that no alterations to the numerical algorithms are required; merely perturb the parameter of interest using a small step size along the imaginary axis, then use complex algebra during the computations. The results will be complex with the derivatives times located in the imaginary part.

In this section, two simple ODE solution methods are employed to demonstrate the approach but CTSE can be applied within more sophisticated ODE solution methods in a similar manner. The examples are provided for first order ODEs but derivatives of parameters for higher order ODEs can be obtained similarly.

Consider the first order ODE

with the initial condition with the values . The solution to the ODE, , will be determined along with the derivatives: . The derivates will be determined using CTSE during the solution process to determine .

The analytical solutions are (using ):

The ODE to be solved, using , is

9.1 Euler's method

[edit]

Euler's method uses the knowledge of at an initial point to project the solution to , that is

with is the step size for the ode solver; this step size is much different than the CTSE step size. In effect, the slope is assume to hold for the entire step .

To compute derivates of with respect to parameters within the ODE, we solve the following ODEs using a small step size, e.g., h=10^{-10}, applied to the parameter of interest. In all cases, the derivative is determined as typical using equation 6.

Derivative

To compute , we perturb the parameter as with . This results in the ODE

with .

Table 5 below shows the first 4 steps taken using Euler's method using an ODE step size of , and for CTSE. The derivative, , can be obtained as always using CTSE by taking the imaginary part and dividing by . The results indicate that the relative error of is slightly larger than for and that both grow larger with increasing .

Derivative

To compute , we perturb the parameter as with . This results in the ODE

with .

Table 6 below shows the first 4 steps taken using Euler's method using an ODE step size of , and for CTSE. The derivative, , can be obtained as always using CTSE by taking the imaginary part and dividing by . The results indicate that the relative error of is slightly larger than for and that both grow larger with increasing . The results for are identical to those in Table 5.

Derivative

To compute , the differential equation is unchanged but we perturb the parameter of the initial condition as with . This results in the ODE

with .

Table 7 below shows the first 4 steps taken using Euler's method using an ODE step size of , and for CTSE. The derivative, , can be obtained as always using CTSE by taking the imaginary part and dividing by . The results indicate that the relative error of is slightly larger than for and that both grow larger with increasing . The results for are identical to those in Table 5.

Results for , , , and from solving the ODE using Euler's method with step size and CTSE step size (red dots) compared with the analytical solutions (solid line).

10 Cauchy-Riemann (CR) representation of a complex number

[edit]

CTSE requires support for complex variables. Although all modern languages have this support, it may be needed in some cases, or more convenient, to augment an existing code using the Cauchy-Riemann (CR) representation of a complex number rather than using complex variables. The CR matrix form of a complex number is


where and are real numbers.

As shown, the real value is always located along the diagonal with the imaginary number located in the off-diagonal locations differing by a minus sign. Therefore, in a CR formulation of CTSE, the real number can be retrieved from the appropriate location of the CR matrix, e.g., , and the imaginary number retrieved as .

The important aspect of the CR representation is that the matrix contains all real numbers. In effect, mathematical operations on complex numbers can be replaced by matrix operations of all real matrices. For example, consider. This same result can be obtained from the matrix multiplication of the equivalent CR matrices, e.g., . Therefore, we can implement CTSE without complex numbers by using the CR form of the complex number and matrix operations. In fact, all the closed-form examples listed above can be solved using CR matrices. One important point though, is that matrix functions must be used for intrinsic functions such as the , and square-root functions.

Example 10.1:

[edit]

Consider the simple function . Then

We see in that for this example CTSE is exact as there are no higher order terms with respect to .

Example 10.2:

[edit]

as . Note, that we must use the matrix function for sine.

Example: Finite element example using a beam element

[edit]

The finite element example presented in Section 6 is repeated here and solved using the Cauchy-Riemann form. Along the way, the conversion from complex to CR form is demonstrated. This example is instructive because it shows how the complex-variable system of equations can be solved using real numbers. In other words, if a complex-variable solver is not available, then the CR form can be employed in conjunction with a real-variable solver.

Formulation of the Cauchy-Riemann system of equations

[edit]

The complex-variable system of equations, is reformulated to consist of only real numbers. To convert the system of equations to the CR form, each complex variable must be expanded to a matrix of all real numbers.

For example, , etc. The right hand side must be expanded for compatibility, that is, . The complete system of equations for this example becomes

.

The deflection and rotation can be extracted from rows 1 and 3 from column 1, and their derivatives can be obtained from rows 2 and 4 of column 1 after dividing by h.

Numerical Example

[edit]

A numerical example will be presented for clarity using the numerical values: , i.e., rectangular cross section of 2 cm in each direction, load , and . As a result, the stiffness matrix and right hand side become

, and . Solving this system yields .

Sensitivity with respect to load P

[edit]

Consider the case where the load is perturbed as . The stiffness matrix is real only (no imaginary terms required) as

, whereas the right hand side becomes Solving for the complex-variable degrees-of-freedom as yields

.

Extracting the real and imaginary parts and dividing the imaginary parts by yields the results .

The CR system of equations becomes

.

Solving the system of equations yields .

From these results we obtain values that agree with the complex-variable solution, namely .

Sensitivity with respect to elastic modulus E

[edit]

Consider the case where is perturbed as . Then the complex stiffness matrix becomes

.

Solving for the complex-variable degrees-of-freedom as yields

Extracting the real and imaginary parts and dividing the imaginary parts by yields the results .

The CR system of equations becomes

.

Solving the system of equations yields .

From these results we obtain values that agree with the complex-variable solution, namely .

Sensitivity with respect to length L

[edit]

Consider the case where the length is perturbed as . Then the complex stiffness matrix becomes

Solving for the complex-variable degrees-of-freedom as yields

.

Extracting the real and imaginary parts and dividing the imaginary parts by yields the results .

The CR system of equations becomes

. Note that the stiffness matrix is no longer symmetric.

Solving the system of equations yields

From these results we obtain values that agree with the complex-variable solution, namely .

11 Overloading functions to provide CTSE support

[edit]

There are some functions for which CTSE will not work. For example, a complex-variable version may not exist, or the complex-variable version may return a only a real result, or the complex-variable implementation may not provide machine precision derivatives. For example, the inverse error function, , does not have a complex-variable version. Another situation is where a functions has a complex-variable form but the result is real only. Such a case is the Absolute function, that is . Third, in some implementations of the inverse trigonometric functions, the relative error can be reduced to but no further. However, as shown below, an artificial “pseudo-complex” version can be created such that CTSE can be used in its standard form for all these cases. In other words, when the CTSE equation 6 is applied, the pseudo-complex version is called which returns the correct result in the imaginary part.

The key is to develop a first order complex-variable Taylor series of the form

Eq. 23

for the function of interest. Here , represents the derivative of the function evaluated at , as a result and equation can be written

Eq. 24

Thus, the only requirement is to define the imaginary portion of the function as so that when equation 6 is applied

11.1 Example:

[edit]

function

To fix ideas, let's compute an overloaded function for cosine that implements equation 24. Clearly we don't need to do this since there exists complex versions of the cosine function but it instructive to do so. Applying equation 24, we have

Eq. 25

Hence, when this version of cosine is called with a complex argument, the imaginary result will be . Therefore equation 6 will return - as the derivate. It should be clear that this version of the complex-variable cosine function is not equivalent to the standard cosine function. It it only useful for computing derivates using CTSE.

11.2 Example:

[edit]

function

Consider the absolute operator . CTSE does not work on this operator since the operator applied to a complex number yields a real result, i.e., . However, we can derive a function that uses the real function as and its derivative as the imaginary part, . Therefore, we can define the overloaded for a complex argument of as

Eq. 26

Then if we evaluate, for example, for any and for any . A plot of the overloaded function is shown in Figure 26.

Figure 27: Plot of the Absolute function and its derivative

11.3 Example:

[edit]

function

As another example, consider the inverse error function, . In the current programming languages only accepts a real variable. We need will overload this definition to support complex arguments.

This function accepts a real argument between .

The derivative is . Hence, the overloaded function for a complex argument is

Eq. 27

A plot of the this function and its derivative is shown in Figure 27.

Figure 28: Plot of the InverseErf function and its derivative

12 Miscellaneous topics

[edit]

A number of miscellaneous topics are discussed below that may be of interest to the reader.

12.1 Accuracy of the numerical derivatives computed by CTSE

[edit]

The CTSE method itself can obtain machine precision accuracy, typically 15 decimal places, on simple functions; however, the accuracy of the derivative depends upon the numerical algorithm to which it is being applied and the output that is being computed. That is, the accuracy of the derivative will only be as good as the accuracy of the numerical method itself. This result was clearly shown in the application of CTSE within numerical integration using Simpson's rule, see Table Experience to date indicates that the derivatives computed using CTSE are usually of the same or one order less accurate than the real-variable results. For example, if the relative error of the real-variable result was of the order , the relative error of the derivative will be of the order . Results to date are entirely empirical, no formal mathematical investigation has been conducted.

12.2 Computing mixed and higher order derivatives using CTSE combined with the finite difference method

[edit]

A natural question is whether CTSE can be used to calculate higher order derivatives such as second order. The short answer is no, CTSE cannot be used to calculate higher order derivates without subtractive error. For example, the application of CTSE to compute derivatives of second order, say by taking 2 steps up the imaginary axis, leads to a subtraction of terms, and, hence, the method becomes step size dependent and it loses significant accuracy. To compute higher order derivatives without subtractive error requires multicomplex or multidual variables and algebra. These topics are covered in subsequent chapters.

However, one can combine CTSE for first order derivatives with the classical finite difference estimation for first order derivatives to obtain second order derivatives that are numerically superior to using the classical finite difference method for second order derivatives. In a nutshell, one can apply a first order finite difference formula to the derivatives estimated by CTSE to approximate a second order derivative, i.e.,

Eq. 28

Contrast this equation with the classical second order finite difference formula

Eq. 29

Using CTSE to compute , yields the equation

Eq. 30

Eq. 31

In contrast to Figure 1 where the perturbation is conducted along the real OR imaginary axes, this method perturbs the variable of interest along BOTH axes simultaneously. As shown in figure 28, one perturbs the variable by along the real axis and along the imaginary axis. However, as we know, . As an example, consider the second order derivative estimate of the cosine function computed using equation 31,

Figure 29: Strategy for computing second order derivatives using CTSE and forward differencing

Figure 29 below shows the relative error of the second order derivative of evaluated at as a function of the real-variable finite difference step size . For all runs, the CTSE step size was set equal to for all analyses so that the first order derivative computed using CTSE was accurate to machine precision. Hence, all inaccuracy is due to the truncation and subtractive error of the finite difference formulas. As shown in the figure, three additional orders of magnitude reduction in the relative error can be obtained using a combination of CTSE for the first order derivative combined with forward differencing compared to the classical second order forward difference formula. Also, it should be noted that for value of between and the 2nd order finite difference produced grossly inaccurate estimates of the 2nd order derivatives. These values are not shown in the figure.

Figure 30: Comparison of second order derivative of at . Second order finite difference (blue triangles) vs. combined CTSE and first order finite difference (red circles); for all CTSE analyses.

12.3 Can CTSE be applied to a functions that are already complex

[edit]

Can one apply CTSE to a function that is already complex? In short, no. One cannot apply equation 6 to a function that is already complex. One must use a different imaginary axis other than (since that axis is being used in the complex function) to apply the perturbations and extract the derivative. One can, for example, use dual numbers that have an imaginary number   thereby combining complex and dual numbers.

13 The use of Quaternions to compute first order derivates

[edit]

Readers familiar with quaternions may be interested to know that they can be used to compute first order derivatives without subtractive error in the same manner as using CTSE.

Quaternions are constructed with one real and 3 imaginary axes , where , etc. The algebraic rules for quaternions are well established.

Quaternion numbers are a subset of hypercomplex numbers con-sting of one real and 3 imaginary axes . A number is defined as

where are real coefficients and are imaginary units governed by the following conditions

The value for using quaternions is that 3 first order derivates can be computed with a single analysis, whereas 3 CTSE analyses would be required. Of course, a quaternions library is needed since quaternions are typically not intrinsic to most programming languages. As such, the runtime of 3 CTSE runs may in fact be faster than a single quaternion analysis. This aspect will be implementation dependent.

The Quaternion Taylor Series Expansion (QTSE) method uses quaternion algebra to compute up to 3 first order derivatives of a real function evaluated at . In this approach, the variables , and are perturbed by a small step size in imaginary directions and respectively:

where . In a similar manner to CTSE, it is recommended that be or smaller.

The function is now evaluated as a quaternion to yield . The real-variable result and the derivatives can be extracted from the imaginary axes of the quaternion evaluation of the function as

where extracts the real coefficient of the quaternion number, extracts the coefficient associated with the imaginary direction , etc. Since no subtractions are performed during quaternion algebra operations, the method does not suffer from subtraction cancellation error. In practical numerical applications, the minimum possible error is usually bounded by machine error.

Example: where are quaternions

Consider the sine function where and are quaternions. In this case the quaternion Taylor series method is used to compute the derivatives with respect to and evaluated at . The and quaternions are defined as shown in equation 37 as

where .

Using the previous equations, the derivatives can be obtained to machine precision using a step size of or smaller.

where means to extract the real coefficient of the direction, and similarly for and .

The results are , , and .

These results are accurate to machine precision, extensions to quaternions can also be used to determine multiple first order derivatives. For example, octonions (one real and 7 imaginary axes) can be used to compute seven first order derivatives in a single analysis; sedenions (one real and 15 imaginary axes) can be used to compute seven first order derivatives in a single analysis, etc.

14 Comments on the accuracy of the CTSE method

[edit]

In many publications, and repeated here, it is stated that CTSE will provide machine precision derivatives if sufficiently small h is used. This is true if the function to which it is being applied provides results that are also machine precision accurate. In general, the accuracy of the derivatives from CTSE will be no more accurate than the algorithm in which it is being employed. In the application to Simpson’s rule, the accuracy is only ∼, see Figure 25. This is because the accuracy of Simpson’s rule for the real part is only ∼. This behavior is true in general. A general rule of thumb is that the derivatives are slightly less accurate than the real part. However, it is true that CTSE (with small enough h) will provide the most accurate answer possible for the particular application, that is, the accuracy will be the same as if the analytical derivative was taken first, then the numerical algorithm applied.

15 Background reading

[edit]

There are several excellent references that discuss CTSE and provide examples. As we have indicated, CTSE is not limited to any particular mathematical or science domain. It has been applied successfully in a myriad of disciplines. The article by Squire and Trapp, 1998, provides a concise summary of CTSE. Martins, et al. describe a scripting procedure that can be used to automatically modify Fortran and C/C++ codes to implement CTSE. There have been multiple applications of CTSE within various domains such as fluid mechanics, eigensensitivity analysis, heat transfer, fracture mechanics, 35 nonlinear structural mechanics, among others. More information regarding the use of quaternions for obtaining derivatives can be obtained from the references by Turner and Cano.

References

[edit]
  1. W. Squire, G. Trapp, "Using complex variables to estimate derivatives of real functions," SIAM Review 40 (1998) 110–112
  2. J.J. Martins, "The complex-step derivative approximation," ACM Transactions on Mathematical Software 29 (2003) 245–263
  3. W.K. Anderson, J.C. Newman, D.L. Whitfield, E.J. Nielsen, "Sensitivity analysis for Navier-Stokes equations on unstructured meshes using complex variables," AIAA Journal 39 (2001) 56–63.
  4. B.P. Wang, A.P. Apte, "Complex variable method for eigensolution sensitivity analysis," AIAA Journal. 44 (2006) 2958–2961.
  5. X.W. Gao, M.C. He, "A new inverse analysis for multi-region heat conduction BEM using complex-variable differentiation method," Engineering Analysis with Boundary Elements 29 (2005) 788–795.
  6. H.R. Millwater, D. Wagner*, A. Baines*, and A. Montoya, “A Virtual Crack Extension Method to Compute Energy Release Rates using a Complex Variable Finite Element Method,” Engineering Fracture Mechanics 162 (2016) 95–111, https://dx.doi.org/10.1016/j.engfracmech.2016.04.002
  7. A. Montoya, R. Fielder*, A. Gomez-Farias*, H. Millwater, “Finite Element Sensitivity for Plasticity using Complex Variable Methods,” J. Eng. Mech. 141 2 (2014), https://ascelibrary.org/doi/abs/10.1061/%28ASCE%29EM.1943-7889.0000837.