Talk:Convex function

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated C-class, Mid-importance)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
C Class
Mid Importance
 Field: Analysis

I'm not sure what the opinion at Wikipedia is on the matter of accessibility of the material. When I come here, however, I am not looking for mathematically concise definitions of what I look up, nor do I handle well the high level description that requires knowledge of many other technical terms.

A paragraph describing the significance of the defining equation would be nice.

For me at least, this page is exactly what I was looking for about convex functions.

Correct Definition[edit]

Should the definition read "for all x,y" instead of for any?

To anyone used to reading math (and maybe most who aren't), it's the same thing. But the change might improve clarity some small amount. Dchudz 17:08, 15 November 2006 (UTC)

But anyone writing math would never say 'for any', as strictly this requires the condition to be satisfied just for some, and not for all instances. In this article, I am missing the 'useful theorems' section for higher dimensions, as well as a generally better treatment for the high dimensions. But, I still appreciate your work!

Merge with concavity?[edit]

Honestly, think that's a bad idea. Although the two concepts are so completely intertwined as to be two sides of the same coin, people come to Wikipedia for answers, not answer-hunting (reference the post above). I suppose it's possible that the merge could be done in such a way as to not be too confusing, but take a look at supermodular to see what can happen when things get thrown together (read the title, and then the definition; but then I suppose I'm supposed to fix it, not just sit back and complain). All in all, it would probably be better just to interlink the two (concavity and convexity) religiously -- say a link to concavity in the main definition. This is a good page right now: concise and exactly what people are looking for. Don't see much need to change it. The preceding unsigned comment was added by Semanticprecision (talk • contribs) .

OK, I did not do any merge, rather moved concavity (which is now disambig) to concave function. Things might still need merging in the future, but at least for now the meaning of concave set and concave function are separate and not in the same article (with concave function taking the lion share of the room). Let us see how it goes. Oleg Alexandrov (talk) 00:28, 30 December 2005 (UTC)
In February 2005, I merged convex and concave function in the German Wikipedia (de:konvexe und konkave Funktionen). IMHO it is very difficult to maintain the versions for convex/concave funtion in a consistent way, and it is easy to write the article in not too confusing way. To avoid problems like in supermodular, clearly the title should mention both (e.g. convex and concave functions, as in the German version). --NeoUrfahraner 09:15, 2 January 2006 (UTC)


Removing link to Additive Inverse[edit]

The link from "opposite" ("The opposite of a convex function is a concave function") to the additive inverses page seems a bit gratuitous. I'm removing it. --Dchudz 14:13, 25 July 2006 (UTC)

The example with f(x)=x4[edit]

I didn't want to modify the article, but there is something suspicious to me. It says:

A twice differentiable function of one variable is convex on an interval if and only if its second derivative is non-negative there; this gives a practical test for convexity. If its second derivative is positive then it is strictly convex, but the converse does not hold, as shown by f(x) = x4.

More generally, a continuous, twice differentiable function of several variables is convex on a convex set if and only if its Hessian matrix is positive semidefinite on the interior of the convex set.


Shouldn't the Hessian reduce to 2nd derivative in one-dimensional case?
The example function f(x) = x4 is convex and its second derivative 12x2 is nonnegative, so I don't see why the converse does not hold. Moreover in the previous sentence it says "if and only if", so the converse MUST hold. Did I miss something ? —Preceding unsigned comment added by 83.37.136.53 (talk) 09:55, 18 January 2008 (UTC)

As far as I see, the article says that if the second derivative is strictly positive, then the function is strictly convex, but if the function is strictly convex, its second derivative need not be strictly positive, with x^4 being a counterexample (this function is indeed strictly convex, but the second derivative is not strictly positive, it also takes the value 0). Oleg Alexandrov (talk) 15:36, 18 January 2008 (UTC)

Nice! what if the second derivative is f``(x)=A(x-x1)^2(x-x2)^2 ? The second derivative is zero at 2 points. "f(x) defined on an interval is called convex (or convex downward or concave upward) if the graph of the function lies below the line segment joining any two points of the graph." ... it should be on or below a line? you could have a continuous set of points where f(x) is zero. -Alok 09:25, 30 August 2012 (UTC)

semi-strictly convex[edit]

what is "semi-strictly convext" of a function? Jackzhp (talk) 02:09, 30 November 2008 (UTC)

Strongly convex[edit]

It would be great to add a section with the definition of a strongly convex function; or, make a page for "strongly convex" (since it appears that it doesn't yet exist) and then link to that page. 71.130.221.31 (talk) 02:46, 2 February 2009 (UTC)

Well, I'd like to know what a "strongly convex" function is? --Bdmy (talk) 12:27, 2 February 2009 (UTC)
If a function is twice continuously differentiable, then is is convex if and only if its second derivative is never negative (and it's strictly convex if in addition the second derivative is never zero). A strongly convex function's second derivative is bounded away from zero.
Following Boyd and Vandenberghe's book, we have:
A twice continuously differentiable function is "strongly convex" if
 \nabla^2 f(x) \ge mI
for all x in the domain. The inequality is with respect to the positive semidefinite cone.
This let's us extend one of the fundamental properties of a convex function (that a convex function is always above a tangent plane) to the following:
 f(y) \ge f(x) + \nabla f(x)^T (y-x) + \frac{m}{2} \|y-x\|_2^2
for all x and y in the domain. If the function were merely convex, the final term with the m wouldn't be there, and if the function were merely strictly convex, the final term wouldn't be there either, but the inequality would become a strict inequality.
The constant m depend on the function, and we need m > 0 otherwise it's not useful. Strongly convex functions are generally nicer than just convex functions when solving an optimization problem. Lavaka (talk) 21:13, 22 February 2009 (UTC)
I added this to the main page, since it's not too long and I think it's useful (but probably doesn't deserve it's own page).
I'm sure about my comments regarding strong convexity, however I'm unsure about some of the strict convexity comments (as the x^4 discussion above shows. update: I'm fixing my strict convexity comments Lavaka (talk) 02:06, 10 September 2009 (UTC)
Dear Lavaka, your comment above may be misunderstood as stating a C2 function is strictly convex if and only if its second derivative is strictly positive ; such statement is wrong; and I corrected it in a previous version of this article. The correct definition of strictly convex is now in the article; and it is the definition found in many books (such as link the rockafellar book). Mennucc (talk) 13:17, 17 February 2010 (UTC)

Hi folks. I'm wondering if there's a typo in the final example in the "Strongly convex functions" section. Shouldn't f' be replaced by f'' in the sentence starting "For example, consider a function f that is strictly convex"? 67.186.29.135 (talk) 18:15, 2 November 2012 (UTC)

Equivalent definition[edit]

I was just told that an equivalent definition is the following: a function is convex if at any point we can find a suppporting line (a line through the point that lies entirely below the function), i.e. f is convex if for any a, there exists an m such that f(x) \ge f(a) + m(x-a) for all x. (It's easy to see why if that condition is true the function is convex, but I can't immediately think of a proof for the other direction.) Wondering if someone knows a source.... I could only find Springer EOM, which indicates that maybe this definition is equivalent to midpoint convex. Shreevatsa (talk) 01:15, 18 February 2009 (UTC)

In the real case you just need to take the value m to be between the left- and right-derivatives of ƒ at a (assuming a is an interior point of the domain, otherwise you can't do it in general). In several dimensions, what you need is more or less the finite dimensional part of the Hahn-Banach theorem. --Bdmy (talk) 11:21, 18 February 2009 (UTC)
Is it not possible that f does not have left- and right-derivatives at a? Anyway, so what is the condition equivalent to? (There is a cute proof of Jensen's inequality assuming this condition: take a to be E[X], then take expectation.) Shreevatsa (talk) 00:12, 20 February 2009 (UTC)
A (finite) convex function on an interval is continuous in the interior of the interval, and has left- and right-derivatives at every interior point a. This can be found in most books on functions of one real variable, together with an illustration of the fact that the "slopes" are increasing: when a < b < c,
 \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a} \le \frac {f(c) - f(b)}{c - b}.
(imagine, or draw for yourself, the picture that goes with these inequalities) --Bdmy (talk) 12:34, 20 February 2009 (UTC)

This is somehow related but have anyone see this equivalent definition, something like: a function is convex if and only if for any arbitrary E

\sup_{\partial E} f = \sup_E f

where I don't remember what E should be :) a compact set?. The above is incorrect in the way it is written; I don't remember the exact formulation. I also remember the converse of Jensen inequality is true (under some conditions). (This was an exercise in Rudin's real and complex analysis.) So, it should give another equivalent definition. -- Taku (talk) 13:42, 18 February 2009 (UTC)

Convex Decreasing[edit]

It would be good to have a picture of a convex decreasing function as well. See Indifference curve. Trogsworth (talk) 19:12, 5 May 2009 (UTC)

Counterexamples[edit]

It'd be really wonderful if along with some examples we could supply some interesting counterexamples. —Preceding unsigned comment added by 68.65.169.203 (talk) 03:16, 17 February 2010 (UTC)

Problem[edit]

The article says:

Suppose ƒ is a one variable function defined on an interval, and let

 R(x,y) = \frac{f(x) - f(y)}{x - y}

(note that R(x,y) is the slope of the red line in the above drawing; note also that the function R is symmetric in x,y). ƒ is convex if and only if R(x,y) is monotonically non-decreasing in x, for y fixed (or viceversa). This characterization of convexity is quite useful to prove the following results.

I have a problem with that: Consider the function  f(x)=x^3 . Its slope is R(x,y) = x^2 + xy + y^2. Consider the slope for y=2 fixed. Then we have R(x,2)=x^2+2x+4. On the interval [-1,\infty[ this function is monotonically non-decreasing. However, ƒ is not convex on that interval.

HermanMansta (talk) 23:42, 19 May 2010 (UTC)

necessary conditions on 2nd derivative for strict convexity[edit]

Regarding telling whether a function is strictly convex by looking at its 2nd derivative, it's interesting that the only counter-example ( f(x) = x^4 ) is one where  \, f''(x) is only zero for a single value of  x . I would speculate that for any example of a differentiable function that is strictly convex but doesn't have a strictly positive 2nd-derivative, that its 2nd derivative would still be positive almost everywhere.

I'm just speculating, but would it be correct to make the following change?

Current wording:

f\,\! strictly convex if  f''(x) > 0 \,\! for all x\,\! (note: this is sufficient, but not necessary)

I'd suggest:

f\,\! strictly convex if and only if  f''(x) \ge 0 for all x\,\! and the set  \{x \, : \, f''(x)=0\} has zero Lebesgue measure. —Preceding unsigned comment added by Bleachpuppy (talkcontribs) 22:07, 30 August 2010 (UTC)
Technically, a convex function doesn't have to have second derivatives everywhere; all we know is that the first derivative exists almost everywhere and is increasing (for convex) and strictly increasing (for strictly convex). It's likely you're correct about the second derivative condition; however, since it's not obvious, we need a reference. — Arthur Rubin (talk) 23:02, 30 August 2010 (UTC)
Look at Rademacher's theorem, which is proved in L.C. Evan's book on fine properties of real functions and probably in Francis Clarke's (SIAM-reprinted) book on nonsmooth analysis. Jeff Cheeger gave an interesting proof some years ago, which is especially interesting for metric measure spaces. Older references include Busemann's book on convex surfaces.  Kiefer.Wolfowitz  (talk) 20:47, 7 February 2011 (UTC)

ashutosh yadav[edit]

i want to discuss a simple topic related to convex function —Preceding unsigned comment added by 124.124.247.13 (talk) 19:17, 7 February 2011 (UTC)

Proper[edit]

I'm proposing that Proper convex function be merged into Convex function. If "proper convex function" is at all notable, it should be noted that a "proper convex function" on ℝn corresponds exactly to a convex function on a convex subset of ℝn. Some of the properties remain the same, although the infimal convolute is simpler to define for total functions to the extended real line than for partial functions to the real line. — Arthur Rubin (talk) 15:44, 11 March 2011 (UTC)

The definition of the conjugate convex was exactly my intention for including the example of an improper convex function. What was your motivation for reversing it as wrong and unsourced? I would state that it is obviously true (the epigraph is indeed convex). I can not source it, as it is an example I made up. However, other examples can be found in standard textbooks (for example Rockafellar, Convex Analysis). Should one better copy such an example and refer to it, instead making up a own example? Duzian (talk) 16:47, 11 March 2011 (UTC)
Your example is not convex in any sense. However, let me add to the reason that a partial convex function and a "Proper" convex function are the same concept.
Concepts:
A function f on a (convex) set contained in ℝn to ℝ is "partial convex" if for any two points x_1 and  x_2 in the domain X of f and any t\in[0,1],
f(tx_1+(1-t)x_2)\leq t f(x_1)+(1-t)f(x_2).
A function g on ℝn to ℝ+ is "extended convex" if it does not take the value −∞, and if for any two points x_1 and  x_2 in ℝn and any t\in[0,1],
g(tx_1+(1-t)x_2)\leq t g(x_1)+(1-t)g(x_2).
It can easily be seen that these are the same concept, if we define a function with empty domain to be convex: To get from the first to the second, just define g to be +∞ where f is not defined, and to get from the second to the first, you let f be g restricted to the points x where g(x) is finite.
And then, a "proper" convex function is an extended convex function which takes a finite value, which corresponds to a non-empty convex function.
This section, if we can find a source, supports what proper convex function should contain. Even if we can't find a source for that, the usage is rare, so it should still just redirect to convex function. — Arthur Rubin (talk) 17:28, 11 March 2011 (UTC)
I have zero doubts about your above arguments and you are right it is a strong argument to merge the sections. However, you do not address improper convex functions (which means they do take the value -\infty). Here is a good reason for having a section on proper convex functions: In optimization problems from engineering the term "closed proper convex function" is frequently used as a synonym for well-behaved function, especially when arguing if a minimum over a convex set is attained. This avoids some technically difficulties. Therefore one might wonder what a proper convex function is, and the difference to an improper convex function. I believe that the following function is an improper convex function
f(x) = \begin{cases} -\infty , & \mbox{if } x \le 0 \\ 0, & \mbox{if } x=0 \\  +\infty, & \mbox{if } x \ge 0 \end{cases}
Its epigraph is according to the definition
\mbox{epi} f = \{ (x, \mu) \, : \, x \in \mathbb{R}^n,\, \mu \in \mathbb{R},\, \mu \ge f(x) \} \subseteq \mathbb{R}^{n+1},
is \{ (x, \mu) \, : \, x < 0 ,\, \mu \in \mathbb{R} \} \cup \{ (x, \mu) \, : \, x = 0 ,\, \mu \in \mathbb{R} \mu \, \ge 0\}, which I believe is a convex set. However, the definition via the line segments fails. How, do you use the line segment definition to argue: "Your example is not convex in any sense". Finally, I would suggest to enrich the proper convex function section with an example for an improper convex function. For the convex function section I would add a remark that the test for convexity ("if and only if part) does not hold for functions on the extended real line. Duzian (talk) 18:34, 11 March 2011 (UTC)
You may be right. However,
  1. Your function appears to be convex in the regular sense if you define +∞ + (−∞) to be +∞, as seems appropriate for the semi-continuous nature of convexity, and
  2. The only way a function on ℝ taking a value of −∞ somewhere can be convex in either sense is, if the function is −∞ on an interval, +∞ outside the closure of of the interval, and possibly finite at the finite endpoints of the interval.
  3. We would need evidence that someone uses the term in that sense. The extension to +∞ can be looked at as a shorthand for partial functions, but -∞ is something new, and may not actually be used.
Arthur Rubin (talk) 01:09, 12 March 2011 (UTC)

Optimization and properness[edit]

Extended real-valued functions are useful for representing life-and-death constraints: Their violations receive the value of positive infinity (which is not a minimum for most erv functions). This is the keypoint, which is now absent from the article.

Please follow the terms of Rockafellar and Wets or the latest book by Borwein and Vanderwerff: There is no need for OR proposing to reformulate the basic terms of an established subdiscipline.  Kiefer.Wolfowitz 18:57, 25 April 2011 (UTC)

Explanation of revert[edit]

KW's edit summary in reverting my edits to the lead was: "convex functions need not be defined on intervals, but frequently are defined on higher dimensional sets". But this certainly suggests that he had not read past the first two sentences of the lead, where it says "More generally, this definition of convex functions makes sense for functions defined on a convex subset of any vector space." I have also moved the more equation-based definition of convexity to a "Definition" section, with some copyedits. We should keep WP:LEAD and WP:MTAA in mind. I don't see any reason that the lead should contain any equations at all. Sławomir Biały (talk) 22:15, 24 April 2011 (UTC)

Moved from User talk:Sławomir Biały:

I'm glad that you showed understanding that a convex function may be defined on a higher dimensional set.
However, you wrongly removed the epigraphic definition, and I hope that you would restore it soon.
Also, imho, it would be better to remove the primitive phrasings "concave up", etc. to the end of the introduction, perhaps in their own paragraph. The reader should meet the definition immediately, and not have to jump over many deprecated alternative definitions.
Thanks,
 Kiefer.Wolfowitz  (Discussion) 22:18, 24 April 2011 (UTC)

The epigraph definition also remains in the first paragraph of the lead. Also, I'm not really sure that "concave up" means the same thing as convexity. This is a pointwise property, that the graph of a function locally lies over the tangent hyperplane, which is slightly different. There is, of course, a strong connection between this and convexity, and in many cases of interest they are the same, but I don't think that the two should be equated in the article. Any thoughts on this? Sławomir Biały (talk) 22:28, 24 April 2011 (UTC)

I have removed the reference to "concave up" from the lead. This probably bears mentioning later on, but I think it best to discuss it further to determine in what setting. Sławomir Biały (talk) 22:34, 24 April 2011 (UTC)
Thanks! I agree that a discussion of subdifferentiability and zero subgradients should be in the body, not in the lead.  Kiefer.Wolfowitz 18:45, 25 April 2011 (UTC)
I would avoid using "concave up" and "concave down"---which used to be traditional in calculus books in the USA, alas, and only a small improvement over "bathtub shaped" or "U-shaped".  Kiefer.Wolfowitz 18:47, 25 April 2011 (UTC)

Joining any two points[edit]

The start of article is confusing "joining any two points of the graph; or more generally, any two points in a vector space." what is any two points? requires more explanation.Osmankhalid2005 (talk) 18:05, 29 January 2013 (UTC)

some questions about the section regarding the properties of convex function[edit]

In the section about the properties of a convex function it is said that if f is contiuous on some open interval, then it has one sided derivatives. As a consequence, it's differentiable at all but at most countably many points. How is this conclusion inferred from the one sided derivaatives? And what happens if f is convex on closed interval? Million of thanks for answerring these questions! :) 46.19.85.157 (talk) 10:27, 11 February 2014 (UTC) ???132.66.85.52 (talk) —Preceding undated comment added 09:30, 17 February 2014 (UTC)

We should have a source for the material, but not only does it have one-sided derivatives, but the "derivative" from I × {-,+} (with lexicographic ordering) is nondecreasing (or nonincreasing; I didn't check which). Hence the sum of all the jumps (in a subinterval) is finite. Hence there are at most countably many jump points (in a subinterval). But the interval is a union of countably many subintervals, and a countable union of countable sets is countable.
If I is a closed interval, then you can work with the interior; it's no longer the case that one-sided derivatives exist. — Arthur Rubin (talk) 07:55, 22 February 2014 (UTC)