Wikipedia:Reference desk/Archives/Mathematics/2020 May 24



May 24

Optimization: Necessary and Sufficient Conditions for Superlinear Convergence of Quasi-Newton Method

Hello,

I am reading *Numerical Optimization* by Nocedal & Wright, and I am having trouble understanding some aspects of the proof of Theorem 3.7. I have written the theorem, its proof, and my questions in LaTeX; I'm sorry I couldn't figure out how to make it render nicely on Wikipedia.

I have also asked this question on Math Stackexchange, if you would prefer to see it there: https://math.stackexchange.com/questions/3686947/proof-of-superlinear-convergence-of-quasi-newton-methods-in-nocedal-wright

I have been stuck on this theorem for many hours, so any help is greatly appreciated.

There are two things I don't understand:

1) The theorem is an iff statement. The author proves one direction, but I don't see how to prove the reverse. (My tentative attempt is below, after the proof.)

2) The author seems to use the assumption that the Hessian is Lipschitz continuous, but this is not an explicit assumption of the theorem. Is this a mistake by the author? (I checked the errata, and it isn't listed there.)

The following is the earlier inequality the author references in the proof; the theorem and its proof follow below.

$$\|x_k + p_k^N - x^*\| = O(\|x_k - x^*\|^2), \tag{3.33}$$

where $p_k^N = -\nabla^2 f(x_k)^{-1} \nabla f(x_k)$ denotes the Newton step.

[The above is where my point #2 comes from. This inequality was derived in the proof of an earlier theorem (the one about quadratic convergence of Newton's method), and that theorem had the hypothesis that the Hessian is Lipschitz continuous, which was used to prove the inequality.]
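
For reference, this is how I believe (3.33) was obtained in that earlier proof (my own sketch, not a quotation from the book; $L$ denotes the Lipschitz constant of $\nabla^2 f$):

$$x_k + p_k^N - x^* = \nabla^2 f_k^{-1} \left[ \nabla^2 f_k (x_k - x^*) - \big( \nabla f_k - \nabla f(x^*) \big) \right],$$

and since $\nabla f_k - \nabla f(x^*) = \int_0^1 \nabla^2 f\big(x^* + t(x_k - x^*)\big)(x_k - x^*)\,dt$,

$$\left\| \nabla^2 f_k (x_k - x^*) - \big( \nabla f_k - \nabla f(x^*) \big) \right\| \le \int_0^1 \left\| \nabla^2 f_k - \nabla^2 f\big(x^* + t(x_k - x^*)\big) \right\| \|x_k - x^*\| \, dt \le \frac{L}{2} \|x_k - x^*\|^2.$$

With mere continuity of $\nabla^2 f$ (no Lipschitz condition), the integrand only tends to $0$, which yields $o(\|x_k - x^*\|)$ rather than $O(\|x_k - x^*\|^2)$; that is why I think the Lipschitz hypothesis is genuinely needed here.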

Theorem 3.7. Suppose that $f\colon \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable. Consider the iteration $x_{k+1} = x_k + p_k$ (that is, the step length $\alpha_k$ is uniformly $1$), where $p_k$ is given by $p_k = -B_k^{-1} \nabla f_k$. Let us assume also that $\{x_k\}$ converges to a point $x^*$ such that $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive definite. Then $\{x_k\}$ converges superlinearly if and only if

$$\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x_k))\,p_k\|}{\|p_k\|} = 0 \tag{3.36}$$

holds.

Proof. We first show that (3.36) is equivalent to

$$p_k - p_k^N = o(\|p_k\|), \tag{3.37}$$

where $p_k^N = -\nabla^2 f_k^{-1} \nabla f_k$ is the Newton step. Assuming that (3.36) holds, we have that

$$p_k - p_k^N = \nabla^2 f_k^{-1} \left( \nabla^2 f_k\, p_k + \nabla f_k \right) = \nabla^2 f_k^{-1} \left( \nabla^2 f_k - B_k \right) p_k = O\!\left( \|(\nabla^2 f_k - B_k)\,p_k\| \right) = o(\|p_k\|),$$

where we have used the fact that $\|\nabla^2 f_k^{-1}\|$ is bounded above for $x_k$ sufficiently close to $x^*$, since the limiting Hessian $\nabla^2 f(x^*)$ is positive definite. The converse follows readily if we multiply both sides of (3.37) by $\nabla^2 f_k$ and recall that $\nabla f_k = -B_k p_k$.

By combining (3.33) and (3.37), we obtain that

$$\|x_{k+1} - x^*\| \le \|x_k + p_k^N - x^*\| + \|p_k - p_k^N\| = O(\|x_k - x^*\|^2) + o(\|p_k\|).$$

A simple manipulation of this inequality reveals that $\|p_k\| = O(\|x_k - x^*\|)$, so we obtain

$$\|x_{k+1} - x^*\| = o(\|x_k - x^*\|),$$

giving the superlinear convergence result.
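
For clarity, two additions of my own (neither is from the book, so please check them). First, here is how I read the "simple manipulation" step: writing the $o(\|p_k\|)$ term as $\varepsilon_k \|p_k\|$ with $\varepsilon_k \to 0$, and using $\|p_k\| = \|x_{k+1} - x_k\| \le \|x_{k+1} - x^*\| + \|x_k - x^*\|$, the last display becomes

$$(1 - \varepsilon_k)\,\|x_{k+1} - x^*\| \le O(\|x_k - x^*\|^2) + \varepsilon_k\,\|x_k - x^*\|,$$

which gives $\|x_{k+1} - x^*\| = o(\|x_k - x^*\|)$, and then also $\|p_k\| \le \|x_{k+1} - x^*\| + \|x_k - x^*\| = O(\|x_k - x^*\|)$.

Second, here is my tentative attempt at the reverse direction of the iff: assuming superlinear convergence, $\|x_{k+1} - x^*\| = o(\|x_k - x^*\|)$, the triangle inequality $\|x_k - x^*\| \le \|x_{k+1} - x^*\| + \|p_k\|$ gives $\|x_k - x^*\| = O(\|p_k\|)$, and then, by (3.33),

$$\|p_k - p_k^N\| \le \|x_{k+1} - x^*\| + \|x_k + p_k^N - x^*\| = o(\|x_k - x^*\|) + O(\|x_k - x^*\|^2) = o(\|p_k\|),$$

which is (3.37), and hence (3.36) by the equivalence established at the start of the proof. Note that this again relies on (3.33), i.e. on the Lipschitz hypothesis, which is part of what worries me in my point #2.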

In an earlier edition of the book (ISBN 978-0-387-22742-9), the statement of Theorem 3.7 starts with: "Suppose that $f$ is twice differentiable and that the Hessian is _Lipschitz continuous_ ..." [my emphasis by underscoring — L.] The statement of Theorem 3.7 in a later edition (ISBN 978-0-387-40065-5) is as above, but the form of (3.36) is subtly different:

$$\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x^*))\,p_k\|}{\|p_k\|} = 0.$$
So what edition is the above from? The presentation is not entirely self-contained; I assume that $\nabla f_k$ is shorthand notation for $\nabla f(x_k)$.  --Lambiam 10:29, 24 May 2020 (UTC)

Thanks for the response. I'm using the second edition; it's good to hear that the Lipschitz hypothesis appears in the first edition. Its omission from the second edition must have been a mistake on the author's part, but unfortunately it isn't listed in the errata.

As for the differences in (3.36): the author does use $\nabla^2 f(x^*)$, while I use $\nabla^2 f(x_k)$. But from going through the proof of Theorem 3.7, I am quite confident that (3.36) as printed is a little wrong; instead of $\nabla^2 f(x^*)$, it should have $\nabla^2 f(x_k)$.
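
To spell out why I think the proof directly uses the $\nabla^2 f(x_k)$ form, and how the $\nabla^2 f(x^*)$ form relates to it (my own estimate, not the book's): by the triangle inequality,

$$\frac{\|(B_k - \nabla^2 f(x^*))\,p_k\|}{\|p_k\|} \le \frac{\|(B_k - \nabla^2 f(x_k))\,p_k\|}{\|p_k\|} + \|\nabla^2 f(x_k) - \nabla^2 f(x^*)\|,$$

and the last term tends to $0$ by continuity of $\nabla^2 f$ together with $x_k \to x^*$; the same bound with the two Hessians swapped gives the reverse implication. So the two forms are at least equivalent under the theorem's assumptions, even though only the $\nabla^2 f(x_k)$ form appears directly in the proof's manipulations.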

Yes indeed, $\nabla f_k$ stands for $\nabla f(x_k)$. I have tried to make the proof as self-contained as possible; did I miss something, or are you referring to the "A simple manipulation of this inequality..." step?  --BlueDream30 15:01, 24 May 2020 (UTC)