- 1 Untitled
- 2 Recent changes to this page
- 3 local extrema?
- 4 Examples
- 5 Sign
- 6 Image
- 7 a slightly more useful definition?
- 8 Another sigmoid?
- 9 Derivative Clarification
- 10 Are some of the sections talking about the logistic?
- 11 External link to Logistic Function implementation in Excel should be maintained
- 12 Sigmoid and sigmoidal
- 13 Sign of first derivative
- Right you are, I don't know what I was thinking. Fixed. (The error function is a proper sigmoid, right?)
Jorge Stolfi 10:24, 31 May 2004 (UTC)
Recent changes to this page
Isn't it better to redirect this page to the logistic function page? Or restore this page to its former glory? The current page is kinda pathetic.
local extrema?
Do you really mean "local minimum" and "local maximum"? The example function given clearly doesn't have any local minima or maxima (but it does have a global minimum of 0 and a global maximum of 1) -- Somebody
Perhaps what is meant is that the second derivative (curvature) has a local minimum and maximum? BTW, I do not agree that the function has global extrema, because as I learned them and as the article on them states, they are points in the domain of the function and are always also local extrema. This function has none. The image of the function has a supremum of 1 and an infimum of 0, though (i.e. the asymptotes of this function are y=0 and y=1). 220.127.116.11 10:03, 23 July 2006 (UTC)
Maybe it's for complex argument values? One is led to think of reals only because the plot is 2d, but maybe the text doesn't assume that. Coffee2theorems 18:22, 23 July 2006 (UTC)
Examples
I'd like to add a gallery of sigmoid-like curves to this article. The hemoglobin example is a nice one. Any others? --HappyCamper 17:22, 30 March 2007 (UTC)
Sign
For the double sigmoid function, do you mean sin?
I also think the double sigmoid function is wrong. What about this one?
or this one:
Image
I see that someone changed the image size recently in order to avoid resolution problems. Maybe the image should be replaced after all with the almost identical vector image? --Hagman-de 15:53, 16 June 2007 (UTC)
a slightly more useful definition?
I've used the sigmoid function on and off, for a long time (about 8 years), and what I use is of course similar to what is presented here, but I would suggest adding two elements into the definition -- a "gain" or "sharpness" factor "k" or "g" -- and a "threshold" or "slider" term that allows the function to be "slid" back and forth across the X-axis:
- Y(t) = 1/(1 + e^(k*(X - thr)))
The neat thing about this more expanded definition is the following:
- The "gain" at X = "thr" is the derivative, of course, and its magnitude is 1/4 the value of k (as I remember).
- The curve can be "flipped around" by changing the sign of k; thus the sigmoid can be made to act like a Boolean NOT if "thr" is 0.5 and k is positive.
- You see the failure of "the law of excluded middle" (LoEM) -- no matter how huge the k, the value of the function at X = "thr" is 0.5. This violates the LoEM.
- You can build e.g. an OR gate by adding X1 and X2, subtracting "thr" = 0.5 and then squashing the sum with the sigmoid:
- OR(X1, X2) = 1/(1 + e^(-12*(X1 + X2 - 0.5)))
- Given that you can build an OR and a NOT you now can approximate any Boolean function.
- Similarly, in a plane, the value of Z(t) will be 0.5 all along a line (it looks like a folded plane)
- From Y = mX + b,
- Y/b - (m/b)*X = 1
- Z(t) = sig(Y/b - (m/b)*X - thr)
- Two of the above Z(t), with reversed signs and slightly different thresholds, added together make a line, like a mountain range on a map, or a canyon. However, if you put three of these plateaus, i.e. "folded sheets" (for a total of just 3 sigmoids), on the X-Y plane, get the signs of their k's right, add them together and pass them through a "second-layer" sigmoid, you have a "triangle" that can be shrunk with higher values of the k's to make a single Matterhorn stick up anywhere on the plane (or to make a sink-hole).
- Given that you can make Matterhorns to your heart's content anywhere on the plane, you can add them together and approximate any curve by "bleeding" one into another. This summation proves that sigmoids can be used to approximate any arbitrary curve, much like a 2-D Fourier transform.
Some of this stuff can be found in a book titled:
- Tom M. Mitchell, Machine Learning, WCB-McGraw-Hill, 1997, ISBN 0-07-042807-7
In particular see "Chapter 4: Artificial Neural Networks" where the Boolean abilities of "perceptrons" are defined as well. I happened onto the tricky business of adding three folded planes together to make a "triangle" (and passing them through a second-layer sigmoid) because a neural net showed me this (!). I've not seen it documented anywhere, but I did see the results of it in a journal once. I'm sure someone who knows the literature better could cite the source. Proofs similar to the above are mentioned in Mitchell. This stuff is easy to do in Excel. Wvbailey 18:39, 17 June 2007 (UTC)
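For anyone who wants to try the gain/threshold form from the comment above outside Excel, here is a minimal Python sketch. It assumes the exponent convention Y = 1/(1 + e^(k*(X - thr))), so a positive k gives a falling curve and a negative k a rising one. The function names, the gains other than -12, the triangle's corners and the second-layer threshold of 2.5 are illustrative assumptions, not taken from Mitchell or from the comment.

```python
import math

def sig(x, k=-12.0, thr=0.5):
    """Sigmoid with gain k and threshold thr: 1 / (1 + e^(k*(x - thr))).

    Assumed convention: k > 0 falls as x grows, k < 0 rises.
    Very large |k*(x - thr)| can overflow math.exp; fine for a demo.
    """
    return 1.0 / (1.0 + math.exp(k * (x - thr)))

def slope_at(x, k, thr, h=1e-6):
    """Central-difference estimate of the slope of sig at x."""
    return (sig(x + h, k, thr) - sig(x - h, k, thr)) / (2.0 * h)

def soft_not(x, k=12.0):
    # Positive k flips the curve: inputs near 0 map near 1 and vice versa.
    return sig(x, k=k, thr=0.5)

def soft_or(x1, x2, k=-12.0):
    # Add the inputs, shift by thr = 0.5, then squash.
    return sig(x1 + x2, k=k, thr=0.5)

def triangle_bump(x, y, k=-30.0, k2=-20.0):
    """Three "folded planes" summed and passed through a second-layer sigmoid.

    Hypothetical triangle with corners (0, 0), (1, 0) and (0, 1).
    """
    inside_x = sig(x, k=k, thr=0.0)        # rises across the line x = 0
    inside_y = sig(y, k=k, thr=0.0)        # rises across the line y = 0
    inside_d = sig(x + y, k=-k, thr=1.0)   # falls across the line x + y = 1
    total = inside_x + inside_y + inside_d  # near 3 well inside the triangle
    return sig(total, k=k2, thr=2.5)        # second layer keeps only "all three"

if __name__ == "__main__":
    print("slope at thr for k = -12:", slope_at(0.5, -12.0, 0.5))
    print("OR(0,0), OR(1,0):", soft_or(0.0, 0.0), soft_or(1.0, 0.0))
    print("NOT(0), NOT(1):", soft_not(0.0), soft_not(1.0))
    print("bump inside, outside:", triangle_bump(1 / 3, 1 / 3), triangle_bump(2.0, 2.0))
```

The slope at the threshold comes out at |k|/4, and the value at X = thr stays at 0.5 whatever the gain, which is the "excluded middle" point made above.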
Another sigmoid?
I wonder if it would be useful to list the following function among the sigmoids:
I have seen it used as a "hack" when a fast S-shaped function was needed, avoiding the (computer) evaluation of exp(x). Its derivative is flat at 0 and 1, and it is symmetrical with respect to the midpoint (meaning, ). For many purposes it works fine, as long as you don't run outside the range [0,1]. —Preceding unsigned comment added by Pasmao (talk • contribs) 12:44, 27 October 2007 (UTC)
- It would be interesting to add something like this. I fiddled with this notion with respect to what would be required for mother nature to build a squasher for making neuralogical ANDs and ORs, and was able to get to some pretty nice approximations -- as long as you stay within the interval. Somewhere I actually worked out the math for this ... a problem arises because, to be useful, the AND etc. needs some "gain" in the middle (i.e. a slope > 1) but the more gain you put in the more difficult the design becomes. For an OR you need a range of -0.25 to +2.25 (i.e. if inputs are "a" and "b" that vary from 0 to 1, add them and squash their sum back to approximately 0 or 1). The first hack starts out with the odd function y = 1*(x-0.5) + 0.5 (just a straight line shifted to the right: yielding (0,0), (1,1) ). This clearly won't work. The trick then is to feed back a certain amount of x^2 to give you some "gain", etc, etc. As I remember this works best if it goes through two iterations. I'm working from memory here... bill Wvbailey (talk) 17:18, 13 January 2008 (UTC)
This is nice! Actually you can generalize it
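The formula posted in the comment that opens this thread is missing from the page text here. As a stand-in only, the sketch below uses the well-known cubic "smoothstep" 3x^2 - 2x^3, which happens to have the properties described there (no exp(), flat derivative at 0 and 1, symmetric about the midpoint). Whether it is the function the original poster had in mind is an assumption, and the names `smoothstep` and `smoothstep_slope` are illustrative.

```python
def smoothstep(x):
    """Cubic S-curve 3x^2 - 2x^3, intended for x in [0, 1]; needs no exp()."""
    return x * x * (3.0 - 2.0 * x)

def smoothstep_slope(x):
    """Exact derivative 6x(1 - x): zero at both ends, largest at x = 0.5."""
    return 6.0 * x * (1.0 - x)

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        # Symmetry about the midpoint: f(x) + f(1 - x) should equal 1.
        print(x, smoothstep(x), smoothstep(x) + smoothstep(1.0 - x))
```

Outside [0, 1] the cubic turns back and leaves the range, which matches the caveat about not running outside the interval.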
Derivative Clarification
I'm pretty sure that not all sigmoid functions have the derivative:
This formula holds only for the logistic function; tanh, for example, has a derivative of 1-tanh^2. It is also confusing because f(...) can be mistaken for applying the function f to (...), whereas in this case it means the result of multiplying the function f by 1-f. Writing dP/dt = P*(1-P) would be clearer.
Jfmiller28 (talk) 23:09, 2 January 2008 (UTC)
- is not even the special case of the logistic function mentioned in the text. How could the reader know which function the formula applies to? This part of the text is very confusing. 18.104.22.168 (talk) 14:36, 7 January 2008 (UTC)
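The point raised in this thread can be checked numerically. The sketch below compares a finite-difference derivative against the two closed forms under discussion: P*(1 - P) for the standard logistic function and 1 - y^2 for tanh. The helper names and the sample points are arbitrary choices made for illustration.

```python
import math

def logistic(t):
    """Standard logistic function 1 / (1 + e^(-t))."""
    return 1.0 / (1.0 + math.exp(-t))

def numeric_derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def max_gap(f, claimed, points):
    """Largest |f'(t) - claimed(f(t))| over the sample points."""
    return max(abs(numeric_derivative(f, t) - claimed(f(t))) for t in points)

points = [-3.0, -1.0, 0.0, 0.5, 2.0]

# P' = P*(1 - P) matches the logistic function.
logistic_gap = max_gap(logistic, lambda p: p * (1.0 - p), points)
# y' = 1 - y^2 matches tanh.
tanh_gap = max_gap(math.tanh, lambda y: 1.0 - y * y, points)
# y*(1 - y) does not describe the derivative of tanh.
tanh_wrong_gap = max_gap(math.tanh, lambda y: y * (1.0 - y), points)

print(logistic_gap, tanh_gap, tanh_wrong_gap)
```

The first two gaps are at the level of rounding error, while the third is large, so the f(1 - f) identity is specific to the logistic function rather than a property of sigmoids in general.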
Are some of the sections talking about the logistic?
- My text by Mitchell, which I listed on the article page (the only reference, BTW), equates the two:
- "σ(y) = 1/(1+e^(-y))
- "σ is often called the sigmoid function or, alternately, the logistic function. Note that its output ranges between 0 and 1 .... Because it maps a very large input domain to a small range of outputs, it is often referred to as the squashing function of the unit [cf Figure 4.6 The sigmoid threshold unit; in this drawing, σ(y) = 1/(1+e^(-net)), where net = Σ_{i=0}^{n} w_i*x_i and w_i is the ith weight for the ith input x_i and x_0 is a constant -- x_0 is important(!)]. The sigmoid function has the useful property that its derivative is easily expressed in terms of its output..." (Mitchell 1997:96-97)
- My guess is writers who distinguish between the two are (needlessly) splitting the hare (hair) and using two different names for the same function depending on where it is used. "Logistic" would seem to come from "logic" i.e. having 1 and 0 outcomes only; "Sigmoid" because of its shape as in "sigmoidoscopy". Anyway, as this is wikipedia and we need sources to back up our claims, mine says they are the same thing. Bill Wvbailey (talk) 15:07, 30 May 2008 (UTC)
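The sigmoid threshold unit in the quotation above can be sketched in a few lines of Python. This is only an illustration of the quoted formulas: the function name `sigmoid_unit` and the list-based interface are made up here, and the constant input x_0 is assumed to be 1 so that w_0 acts as a bias weight.

```python
import math

def sigmoid_unit(weights, inputs):
    """Return (output, derivative with respect to net) of one sigmoid unit.

    weights[0] pairs with the constant input x_0 (assumed to be 1 here);
    weights[1:] pair with the entries of inputs.
    """
    if len(weights) != len(inputs) + 1:
        raise ValueError("need exactly one more weight than inputs")
    xs = [1.0] + list(inputs)                       # prepend the constant x_0
    net = sum(w * x for w, x in zip(weights, xs))   # net = sum of w_i * x_i
    out = 1.0 / (1.0 + math.exp(-net))              # sigma(net)
    return out, out * (1.0 - out)                   # derivative from the output alone

if __name__ == "__main__":
    print(sigmoid_unit([0.0, 1.0, -1.0], [2.0, 2.0]))
```

The second return value shows the property the quotation ends on: once the output is known, the derivative needs no further call to exp().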
External link to Logistic Function implementation in Excel should be maintained
An external link that has been on this page for a good while  points to a very useful implementation of the model in Excel. I found that the author moved the site, so the link redirects to . I tried to update it, and now a user, MrOllie, has been deleting this edit, pointing to WP:EL (the policy on external links). I appreciate his point, as in many topics of opinion, blogs are not an authoritative source. As this article is about math, I can't see the difference between the resource  and . Both describe the logistic curve in further detail and show someone interested how to implement it. I find the Excel version very useful and relevant, and from comments in , some other people find it useful as well. I would like to hear other members' opinions. 22.214.171.124 (talk) 17:46, 31 October 2008 (UTC)
Sigmoid and sigmoidal
The following is called a "sigmoidal function" in another article:
- Sigmoidal and sigmoid seem to me to be the same thing; maybe there is some very slight difference, but it's not pointed out by the article. --Kri (talk) 00:38, 16 February 2011 (UTC)
Sign of first derivative
Currently the section "Definition" says
- A sigmoid function is a bounded differentiable real function that is defined for all real input values and has a positive derivative at each point.
but then in the very next sentence the section "Properties" says
- In general, a sigmoid function is real-valued and differentiable, having either a non-negative or non-positive first derivative which is bell shaped.
So the article is inconsistent as to whether it must be upward sloping or whether it can alternatively be downward sloping, and (if downward sloping is precluded) as to whether it must have a positive or just a non-negative derivative. Duoduoduo (talk) 14:15, 2 November 2013 (UTC)