# Talk:Derivation of the conjugate gradient method


## Conjugate Gradient method is not derived here

This is not a derivation at all. It just lists the steps of the method. Where does the quadratic function f(x)=(b-x)^T A (b-x) come from? You can't call this a derivation and start with a function. The derivation should lead to this function, and also lead to the expressions for alpha and p. Also, why do you want to minimize this function? What does it represent, that the entire method is based on minimizing it? After reading a derivation, it should be clear where all this comes from. This is like saying, 'this is what works, therefore it is derived.' Leftynm (talk) 16:54, 31 January 2011 (UTC)

Like it or not, if you bother to read their ground-breaking paper at all, you will see that this is the published way Hestenes and Stiefel developed CG. Hestenes and Stiefel introduced the function ${\displaystyle f({\boldsymbol {x}})}$ to show how the conjugate directions method can be viewed as a relaxation method. That function just so happens to induce successive minima along the search directions that correspond to the iterates of CG. This could well be a coincidence, whose origin you simply cannot question. A little more common sense tells you that this section is unfinished. Hestenes and Stiefel devoted an entire section in their paper to deriving CG from CD, which just cannot be squeezed into these few words. Kxx (talk | contribs) 08:20, 1 February 2011 (UTC)
Like it or not, that is *not* the function Hestenes and Stiefel minimize. They minimize f(x)=(x-h)^T A (x-h) (or rather f(x)=(x-h,A(x-h)), which is the same), where in their notation h is the true solution of Ax=k, i.e. h=A^{-1} k. Then they write it as f(x)=x^T A x-2 k^T x (or rather (x,Ax)-2(x,k)+(h,k), which is the same up to the irrelevant constant (h,k)). I corrected the article to use the second form and replaced k by b for the right hand side, which is more common nowadays. (ezander) 134.169.77.186 (talk) 17:07, 3 February 2011 (UTC)
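
For anyone who wants to check the two forms numerically, here is a minimal sketch in plain Python. The 2x2 matrix, the right hand side, the test point and all helper names are made up for illustration and are not taken from the paper or the article. It compares (x-h, A(x-h)) with (x,Ax)-2(x,k)+(h,k), and also computes the exact minimizer of f along a direction p, which for symmetric positive definite A is alpha = (p,r)/(p,Ap) with r = k-Ax.

```python
# Illustrative only: A, k, x and the helper names are assumptions.

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Euclidean inner product (u, v)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f_error_form(A, h, x):
    """(x - h, A(x - h)), where h is the exact solution of A h = k."""
    d = [xi - hi for xi, hi in zip(x, h)]
    return dot(d, matvec(A, d))

def f_expanded(A, k, h, x):
    """(x, Ax) - 2(x, k) + (h, k); valid when A is symmetric and A h = k."""
    return dot(x, matvec(A, x)) - 2.0 * dot(x, k) + dot(h, k)

def exact_step(A, k, x, p):
    """Step length minimizing f along x + alpha * p: (p, r) / (p, Ap)."""
    r = [ki - yi for ki, yi in zip(k, matvec(A, x))]  # residual k - Ax
    return dot(p, r) / dot(p, matvec(A, p))

A = [[4.0, 1.0], [1.0, 3.0]]      # symmetric positive definite
k = [1.0, 2.0]
h = [1.0 / 11.0, 7.0 / 11.0]      # solves A h = k for this A and k
x = [2.0, 1.0]                    # arbitrary test point

r = [ki - yi for ki, yi in zip(k, matvec(A, x))]
alpha = exact_step(A, k, x, r)    # steepest-descent direction p = r
x_new = [xi + alpha * ri for xi, ri in zip(x, r)]

print(f_error_form(A, h, x), f_expanded(A, k, h, x))
print(alpha, f_error_form(A, h, x_new))
```

The two printed values on the first line should agree up to rounding, and f at the new point should be no larger than at nearby step lengths along the same direction.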