Talk:Elo rating system


This article is of interest to the following WikiProjects:
WikiProject Chess (Rated C-class, Top-importance)
This article is within the scope of WikiProject Chess, a collaborative effort to improve the coverage of Chess on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as Top-importance on the project's importance scale.

WikiProject Board and table games (Rated C-class, Mid-importance)
This article is part of WikiProject Board and table games, an attempt to better organize information in articles related to board games and tabletop games. If you would like to participate, you can edit the article attached to this page, or visit the project page, where you can join the project and/or contribute to the discussion.
This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.

Why the Logistic Function?

The page assumes that the use of the logistic function in Elo's formula is obvious, but it really isn't. Indeed, it's straightforward to show that Elo's assumption that performance is normally distributed means that the expected score, as a function of rating difference, should follow a cumulative normal distribution. The logistic function is a decent approximation of that, of course, but there is no mention of Elo's justification. Preceding unsigned comment added by 96.242.81.223 (talk) 15:00, 19 June 2016 (UTC)

Explain exactly how Elo's original formula assumes a normal distribution. Anonywiki (talk) 15:43, 16 September 2016 (UTC)

Elo's theory is based on pairwise comparisons. A first, subsequently forgotten, attempt to develop a theory of chess ratings was made by Ernst Zermelo, using the New York 1924 chess tournament as an example. See Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung, Mathematische Zeitschrift 29, 1929, pp. 436–460.
Elo chose the Gaussian distribution "after extensive investigation". He did consider other distributions: Verhulst, Perks, rectangular and linear, binomial, and Maxwell-Boltzmann. As the Elo system is self-correcting, the actual distribution is not that important, as long as the distribution function is monotonic and continuous.
Clpippel (talk) 15:07, 8 November 2016 (UTC)
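
To make the comparison in this thread concrete, here is a minimal Python sketch that evaluates the two curves side by side. It relies on assumptions that are not stated on this page: each player's performance is taken to be normal with a standard deviation of 200 points (so the difference of two performances has standard deviation 200√2), and the logistic curve uses base 10 with a 400-point scale. The function names are made up for illustration.

```python
import math

# Assumption for this sketch: standard deviation of a single player's performance.
SIGMA_PER_PLAYER = 200.0


def normal_expected(diff: float) -> float:
    """Expected score under the normal model for a rating difference `diff`."""
    # The difference of two independent normal performances has sigma * sqrt(2).
    sigma_diff = SIGMA_PER_PLAYER * math.sqrt(2.0)
    z = diff / sigma_diff
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_expected(diff: float) -> float:
    """Expected score under the base-10 logistic curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


if __name__ == "__main__":
    print("diff  normal  logistic")
    for diff in range(0, 801, 100):
        print(f"{diff:4d}  {normal_expected(diff):.4f}  {logistic_expected(diff):.4f}")
```

Under these assumptions the two curves agree at a difference of zero and stay within a couple of percentage points of each other up to 800 points, with the normal curve giving the favourite slightly more at large differences.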

Dubious

Pacerier (talk) 01:32, 30 June 2015 (UTC): ❝

The page states "Three votes are cast for each photograph and an Elo score is determined for all photographs". Does this even make sense? There are only so many variations you can have with three votes and two options.
Let's take an example:
Photo 1: Keep, Throw, Throw
Photo 2: Keep, Keep, Throw
Photo 3: Keep, Throw, Keep
Photo 4: Keep, Keep, Keep
Photo 5: Throw, Throw, Throw
Photo 6: Keep, Throw, Throw
Photo 7: Keep, Keep, Throw
Photo 8: Keep, Throw, Keep
Photo 9: Keep, Keep, Keep
What would the "Elo score" for the photographs be?

What is intended is the following: any two photos A and B are compared three times. For example, photograph A gets Keep, Throw, Throw, and photograph B receives Throw, Keep, Keep. Assuming Keep = 1 and Throw = 0, photo B beats photo A by 2 to 1. After a "sufficient" number of comparisons, the total score and the Elo score can be determined in a meaningful way. Clpippel (talk) 13:24, 1 January 2016 (UTC)
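
The pairwise reading above can be sketched in a few lines of Python. This is only one possible interpretation, not the method used by the site the article describes: it treats the share of Keep votes as the actual score in a standard Elo update, and the K-factor of 32, the starting ratings, and the function names are all hypothetical.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def compare(votes_a, votes_b, rating_a, rating_b, k=32.0):
    """Update two photo ratings from one three-vote comparison.

    Keep counts as 1 and Throw as 0; the share of Keep votes is used as
    the actual score (an assumption made for this sketch).
    """
    keep_a = sum(1 for vote in votes_a if vote == "Keep")
    keep_b = sum(1 for vote in votes_b if vote == "Keep")
    total = keep_a + keep_b
    # With no Keep votes at all, treat the comparison as a draw.
    score_a = keep_a / total if total else 0.5
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    new_a, new_b = compare(
        ["Keep", "Throw", "Throw"],  # photo A
        ["Throw", "Keep", "Keep"],   # photo B
        1500.0,
        1500.0,
    )
    print(new_a, new_b)
```

With equal starting ratings, photo B (two Keep votes against one) gains exactly what photo A loses, so the pool average is preserved.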

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Elo rating system. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Question? Archived sources still need to be checked

Cheers. cyberbot II (Talk to my owner: Online) 22:39, 28 August 2015 (UTC)

Significance of range of ratings for a game?

Is there anything useful to be said (in this article) about the range of ratings, in particular the highest rating achieved, for a given game? I see that the record chess rating achieved is 2882 by Magnus Carlsen in 2014 (List_of_chess_players_by_peak_FIDE_rating), while I see in the paper on AlphaGo that top-ranked (9 dan professional) Go players are rated around 3500. Does this permit or suggest any conclusions about the difference in the games, or in the level to which they are played? PJTraill (talk) 23:03, 31 January 2016 (UTC)

No, it does not. The KNDB rating of the strongest draughts player is 1579. These absolute numbers are meaningless. What matters is the rating difference at the same point in time and within the same rating pool. From the record chess rating list, we can derive that Carlsen is 31 rating points stronger than Caruana, which gives him, according to the FIDE table, an expected score advantage of 53% / 47%.
Is Carlsen stronger than Fischer? To make an educated guess, we have to consider the rating differences between Fischer and his competitors, and compare them with the rating differences between Carlsen and his competitors. IJmuiden, Clpippel (talk) 18:28, 2 February 2016 (UTC)
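
The point that only differences within one pool matter can be illustrated with a short Python sketch. It uses the base-10 logistic curve with a 400-point scale, which is an assumption here and is not the same thing as the FIDE lookup table mentioned above, so its output need not match the table to the percentage point. The second rating in each pair is simply the first minus 31 and does not refer to any real player.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic expected score; depends only on the rating difference."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


if __name__ == "__main__":
    gap = 31.0  # the rating gap discussed in the comment above

    # Same 31-point gap at two very different absolute levels.
    chess_level = expected_score(2882.0, 2882.0 - gap)
    draughts_level = expected_score(1579.0, 1579.0 - gap)

    print(f"gap {gap:.0f} at 2882: {chess_level:.4f}")
    print(f"gap {gap:.0f} at 1579: {draughts_level:.4f}")
```

Both calls return the same value, a little above one half, because shifting every rating in a pool by a constant leaves all expected scores unchanged.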

non commutative - is that correct?

The article claims Elo ratings are non-commutative. The only source for this is a single line in a slide presentation (and apparently a graduate student talk, so not a WP:RS). I would have thought that this was incorrect, because Elo ratings are updated in a block, month by month. Isn't it true that start-of-the-month ratings (rather than "live" ratings) are used in each month's calculation? Adpete (talk) 23:07, 29 March 2016 (UTC)

FIDE Rating Regulations effective from 1 July 2014, section 8.55(c), says, "(c) ΣΔR x K = the Rating Change for a given tournament, or Rating period." That says to me that the ratings are updated at the end of a ratings period (i.e. monthly). And therefore they are commutative, within a given ratings period. Adpete (talk) 23:18, 29 March 2016 (UTC)

(24 hours later) In the absence of any discussion, I will delete the entire section shortly. The reason being: I see no evidence of commutativity being discussed in any WP:Reliable Source, so this article should not discuss it either. Adpete (talk) 23:37, 30 March 2016 (UTC)

Hi Adpete, I'm sorry that the section has to go. Though the phenomenon seems obvious to me, I can't find any reliable source (http://blog.daave.com/2011/06/adventures-with-google-app-engine.html being a blog). Just a comment about the FIDE regulation: even if the rating update is commutative within a period, it would still not be commutative between 2 different periods. Also, 24 hours is not much time, especially since several Wikimania deadlines fell yesterday. Anyway, remove it if you must. Cheers, cmɢʟeeτaʟκ 19:38, 31 March 2016 (UTC)

The "non-commutative" part is correct on online web servers such as Internet Chess Club, where ratings are updated immediately after each game. However, for ratings that matter this is generally a non-issue. For the non-commutative part to hold in the FIDE system, the games between the players would have to be in different months. Also, since tournaments are not rated before they are completed, the games would also have to be in separate tournaments. On the whole I support Adpete's decision, since the effect of the non-commutativity is of very minor consequence in the few cases where it is seen. Sjakkalle (Check!) 16:18, 1 April 2016 (UTC)

To me, the issue is whether it's covered in WP:Reliable Sources. I'm more than happy to have a little bit about it if RSs cover it. If no RSs cover it, that means it's a mathematical curiosity which is true (for ratings which are continuously updated), but which no one cares about. Adpete (talk) 11:31, 2 April 2016 (UTC)
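
For readers following this thread, the order dependence under discussion can be reproduced with a small Python sketch. It is a simplified illustration under stated assumptions, not the FIDE procedure: a single player's rating is tracked, the opponents' ratings are held fixed, K is 32, and the logistic expected-score curve is used. The function names and the two example games are hypothetical.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def live_update(rating: float, games, k: float = 32.0) -> float:
    """Re-rate after every game, as an online server might."""
    for opponent, result in games:
        rating += k * (result - expected_score(rating, opponent))
    return rating


def period_update(rating: float, games, k: float = 32.0) -> float:
    """Rate all games against the start-of-period rating, then apply the sum."""
    change = sum(result - expected_score(rating, opponent) for opponent, result in games)
    return rating + k * change


if __name__ == "__main__":
    # (opponent rating, result): a win against 1500 and a loss against 1700.
    games = [(1500.0, 1.0), (1700.0, 0.0)]
    swapped = list(reversed(games))

    print("live,   original order:", live_update(1500.0, games))
    print("live,   swapped order: ", live_update(1500.0, swapped))
    print("period, original order:", period_update(1500.0, games))
    print("period, swapped order: ", period_update(1500.0, swapped))
```

In this sketch the live ratings differ by somewhat less than one point depending on the order of the two games, while the period-based result is identical for both orders, which matches the distinction drawn above between continuously updated ratings and ratings computed once per rating period.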