
Talk:Exponential smoothing: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 18:25, 15 December 2010

WikiProject Statistics (Unassessed)
This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on Wikipedia's content assessment scale.
This article has not yet received a rating on the importance scale.

The original version

18:03, 11 August 2006 (UTC) 141.122.9.165 This article is giving an example of using a general technique, even though it says it is a bad example. RJFJR 17:18, 5 March 2006 (UTC)

The whole thing reads like a student essay. --A bit iffy 15:17, 27 April 2006 (UTC)

I do not have access to recent research papers, etc., but I can give the following reference: Montgomery, Douglas C., "Forecasting and Time Series Analysis", 1976, McGraw-Hill Inc. martin.wagner@nestle.com

It's pretty messy. I've seen worse, though. One of the early formulas seems right, but I haven't looked closely at the text yet. Michael Hardy 20:59, 23 October 2006 (UTC)

For such a widely adopted forecasting method, this is extremely poor. At the very least, standard notation should be adopted, see Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods and applications (3rd ed.). New Jersey: John Wiley & Sons. State-space notation would also be a useful addition. Dr Peter Catt 02:29, 15 December 2006 (UTC)

I feel really sorry to see poor work like this on Wiki.

A complete rewrite

OK, nobody liked this article very much, and it even came up over on this talk page. So I've rewritten it in its entirety. I'll try to lay my hands on a few reference books in the next month or so, so that I can verify standard notation, etc. Additional information about double and triple exponential smoothing should also go in this article, but at least I've made a start. DavidCBryant 00:45, 10 February 2007 (UTC)

Problems with weighting

Now that the article has been re-written to be much clearer, I see that there are some problems which I had not previously noticed.

  1. Currently, the way the average is initialized gives much too much weight to the first observation. If α is 0.05 (corresponding to a moving average of about 20 values), when you initialize and then input a second observation, the average is (1*x1 + 19*x0)/20 which gives the first observation 19 times the weight of the second when it would be more appropriate to give the first observation 19/20 of the weight of the second observation.
  2. No provision is made for the practical problem of missing data. If an observation is not made at some time, or it is made but lost, then what do we do?

Perhaps these difficulties could be addressed, in part, by separately computing a normalization factor: form a sum in the same way but always using 1 as the data, then divide the sum of the actual observations by that factor. JRSpriggs 03:45, 12 February 2007 (UTC)
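The normalization idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's method: the function names `plain_ewma` and `normalized_ewma` are hypothetical, missing observations are represented by `None`, and the plain version is assumed to start from s_0 = x_0.

```python
from typing import List, Optional, Sequence


def plain_ewma(xs: Sequence[float], alpha: float) -> List[float]:
    """Plain recursion s_t = alpha*x_t + (1-alpha)*s_{t-1}, assuming s_0 = x_0."""
    s = xs[0]
    out = [s]
    for x in xs[1:]:
        s = alpha * x + (1.0 - alpha) * s
        out.append(s)
    return out


def normalized_ewma(xs: Sequence[Optional[float]], alpha: float) -> List[Optional[float]]:
    """Weighted average with an explicit normalization factor.

    num accumulates the decayed observations; den accumulates the same
    weights applied to a constant 1. A missing observation (None) decays
    both sums but contributes nothing to either.
    """
    decay = 1.0 - alpha
    num = 0.0
    den = 0.0
    out: List[Optional[float]] = []
    for x in xs:
        num *= decay
        den *= decay
        if x is not None:
            num += x
            den += 1.0
        # No estimate exists until at least one observation has arrived.
        out.append(num / den if den > 0.0 else None)
    return out


# With alpha = 0.05 and observations x0 = 0, x1 = 1:
print(plain_ewma([0.0, 1.0], 0.05)[1])       # weight of x1 is 1/20
print(normalized_ewma([0.0, 1.0], 0.05)[1])  # weight of x1 is 1/1.95
```

In the normalized version the first observation carries 0.95 times the weight of the second, which is the 19/20 ratio asked for in point 1, and a gap in the data simply ages the existing weights, which is one possible answer to point 2.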

Do you see this as a problem with the article itself, or a problem with the statistical technique described in the article? --David Eppstein 08:06, 12 February 2007 (UTC)
Well, that is the question, is it not? I do not know enough about statistics to know whether the technique has been described incorrectly or whether we should point out that the technique has these limitations. JRSpriggs 09:59, 12 February 2007 (UTC)
When I learned about this technique, I think I remember learning that either of the two methods could be used to initialize it (either copy the first data point enough times to fill in the array, or copy the most recent data point enough times). But I have no idea where I learned about this and I have no references on it, so I can't say what is done in practice. The article on moving average also has no discussion of how to initialize the array. It's hardly a limitation on the method because the method is intended for large data sets, not tiny ones. There seem to be some references at moving average like this one. CMummert · talk 13:27, 12 February 2007 (UTC)

Perhaps we should consider merging this article into Moving average#Exponential moving average. Otherwise, I think that the weighting should be more like this:

[formula not preserved]

where the sum is over observations with [condition not preserved]. What do you-all think? JRSpriggs 04:35, 13 February 2007 (UTC)

I agree on both counts, particularly the merge. MisterSheik 18:47, 16 February 2007 (UTC)

The exponential moving average

The article says: For example, the method of least squares might be used to determine the value of α for which the sum of the quantities $(s_n - x_n)^2$ is minimized. Uh? I think such an α would be one! Albmont 11:14, 28 February 2007 (UTC)

Thanks. It should have been $s_{n-1}$ instead of $s_n$. I changed it. JRSpriggs 12:25, 28 February 2007 (UTC)

Is the smoothing formula right?

Should it be

$s_t = \alpha x_t + (1 - \alpha) s_{t-1}$

or

$s_t = \alpha x_{t-1} + (1 - \alpha) s_{t-1}$

?

This indicates the first one, but this PowerPoint presentation implies the second one.

If the formula is used for forecasting purposes, then it looks to me like the second one is the only usable one (and also looks more natural somehow). Or am I missing something? (I'm very, very rusty on this now!). Hope someone can clear this up. --A bit iffy 11:18, 3 March 2007 (UTC)

The choice is a matter of convention (for strictly periodic data), but I think that the first one (which we use here) is more natural, since people would compute the smoothed value as soon as possible and would naturally want to label it with the time that they computed it. JRSpriggs 11:27, 4 March 2007 (UTC)


The textbooks that I use for teaching time series use the second one (with the lagged value of the series). See Moore, McCabe, Duckworth and Sclove, The Practice of Business Statistics. In Minitab, on the other hand, the smoothed value at time t is defined using the first formula, while the fitted value at time t is the smoothed value at time (t-1). EconProf86 16:20, 28 May 2007 (UTC)
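A small Python sketch may make the relation between the two conventions concrete. It assumes the first convention starts from s_0 = x_0 and that the lagged convention has no value at time 0; the names `smoothed` and `fitted` are illustrative only and are not taken from Minitab or any textbook.

```python
from typing import List, Optional, Sequence


def smoothed(xs: Sequence[float], alpha: float) -> List[float]:
    """First convention: s_t = alpha*x_t + (1-alpha)*s_{t-1}, assuming s_0 = x_0."""
    s = xs[0]
    out = [s]
    for x in xs[1:]:
        s = alpha * x + (1.0 - alpha) * s
        out.append(s)
    return out


def fitted(xs: Sequence[float], alpha: float) -> List[Optional[float]]:
    """Second convention: f_t = alpha*x_{t-1} + (1-alpha)*f_{t-1}.

    Assumes f_1 = x_0; f_0 is left undefined (None) because no earlier
    observation exists to forecast from.
    """
    out: List[Optional[float]] = [None]
    f = None
    for t in range(1, len(xs)):
        f = xs[0] if t == 1 else alpha * xs[t - 1] + (1.0 - alpha) * f
        out.append(f)
    return out


xs = [3.0, 5.0, 4.0, 6.0]
print(smoothed(xs, 0.5))  # values labelled with the time they were computed
print(fitted(xs, 0.5))    # the same values shifted one step later
```

Under these assumptions the second sequence is the first one shifted by one time step, so the two formulas differ only in how the result is labelled, which matches the description of Minitab's "fitted value" above.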

Negative values for smoothing factor α

All references I have looked at suggest that the value of α must be chosen between 0 and 1. However, none offer any reason for this. Although such a range may be "intuitive", I have worked with datasets for which the optimal value for α (in a least-squares sense, as described in the article) is negative. Why would this be wrong? koochak 10:30, 5 March 2008 (UTC)

Look at the meaning of the α. It is a percentage of the smoothed value that should be generated using the previous smoothed value. You cannot have a negative percentage. JLT 15:03, 16 Dec 2009 (CST)

Corrected an Error

I removed an inaccuracy that stated that simple exponential smoothing was the same as Brown exponential smoothing. This is not the case; Brown's method is double exponential smoothing. JLT 14:51, 16 Dec 2009 (CST) -- Preceding unsigned comment added by 131.10.254.62 (talk)

Unsatisfying Derivation of Alpha

The statement that there is no simple way to choose α is very unsatisfying.

If one considers the impulse-response of this method, then the time delay of the response (mean) is 1/α data points and the rms width of the response is also on the order of (but not exactly) 1/α data points. Thus the method smooths with a smoothing width of 1/α data points, and this is a perfectly good way to choose an α.

208.252.219.2 (talk) 16:01, 18 August 2010 (UTC)
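The impulse-response claim above can be checked numerically with a short Python sketch. It assumes the weights are w_k = α(1-α)^k with lag k = 0 for the newest observation, and it truncates the infinite sum at a hypothetical `n_terms`; the function name is illustrative.

```python
import math
from typing import Tuple


def impulse_response_moments(alpha: float, n_terms: int = 5000) -> Tuple[float, float]:
    """Mean lag and rms width of the weights w_k = alpha * (1 - alpha)**k."""
    weights = [alpha * (1.0 - alpha) ** k for k in range(n_terms)]
    total = sum(weights)  # close to 1 when n_terms is large enough
    mean_lag = sum(k * w for k, w in enumerate(weights)) / total
    variance = sum((k - mean_lag) ** 2 * w for k, w in enumerate(weights)) / total
    return mean_lag, math.sqrt(variance)


mean_lag, rms_width = impulse_response_moments(0.05)
print(mean_lag, rms_width)  # compare both with 1/alpha = 20
```

With this lag convention the mean lag works out to (1-α)/α and the rms width to sqrt(1-α)/α, so both are close to 1/α for small α; if the lag is instead counted from 1, as in the forecasting convention, the mean becomes exactly 1/α.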

WP:SOFIXIT --Muhandes (talk) 10:03, 20 August 2010 (UTC)

least squares optimisation of alpha

I do not understand exactly why optimizing alpha using least squares methods should work. The sum of squares of differences is minimized for alpha=1, where it equals 0. By continuity of the optimization problem, I suppose there are no other non-trivial solutions. Please give some citation/reference.
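One way to look at this question is to compare two different sums of squares on a toy series. The Python sketch below is only an illustration under assumptions: the data are made up, s_0 = x_0 is assumed, and the function names are hypothetical. It contrasts the same-time residual s_t - x_t with the one-step-ahead error s_{t-1} - x_t mentioned earlier on this page.

```python
from typing import List, Sequence


def smooth(xs: Sequence[float], alpha: float) -> List[float]:
    """s_t = alpha*x_t + (1-alpha)*s_{t-1}, assuming s_0 = x_0."""
    s = xs[0]
    out = [s]
    for x in xs[1:]:
        s = alpha * x + (1.0 - alpha) * s
        out.append(s)
    return out


def sse_same_time(xs: Sequence[float], alpha: float) -> float:
    """Sum of (s_t - x_t)^2: compares each smoothed value with its own input."""
    return sum((s - x) ** 2 for s, x in zip(smooth(xs, alpha), xs))


def sse_one_step(xs: Sequence[float], alpha: float) -> float:
    """Sum of (s_{t-1} - x_t)^2: treats s_{t-1} as the forecast of x_t."""
    s = smooth(xs, alpha)
    return sum((s[t - 1] - xs[t]) ** 2 for t in range(1, len(xs)))


# Made-up series that fluctuates around a roughly constant level.
xs = [10.0, 12.0, 9.0, 11.0, 10.0, 12.0, 9.0, 11.0, 10.0, 12.0]
grid = [i / 100 for i in range(1, 101)]  # alpha = 0.01, 0.02, ..., 1.00
best_alpha = min(grid, key=lambda a: sse_one_step(xs, a))

print(sse_same_time(xs, 1.0))  # zero: at alpha = 1 the output copies the input
print(sse_one_step(xs, 1.0))   # not zero: x_{t-1} is an imperfect forecast of x_t
print(best_alpha, sse_one_step(xs, best_alpha))
```

On this series the same-time sum is indeed trivially minimized at alpha=1, as the comment says, but the one-step-ahead sum is not zero there and its grid minimum lies below 1. This does not replace a citation; it only suggests that the least squares criterion is non-trivial when the forecast error is used.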