Sports rating system

From Wikipedia, the free encyclopedia

A sports rating system is a system that analyzes the results of sports competitions to provide objective ratings for each team or player. Rankings are then derived by sorting the teams by rating and assigning an ordinal rank to each, with the highest-rated team earning the #1 rank. Rating systems provide an alternative to traditional sports standings, which are based on win-loss-tie ratios, and to polls (sometimes called "power rankings" in sports journalism), which are subjective ratings of the teams in a league.

In the United States, the most prominent use of sports rating systems is to rate NCAA Division I FBS football teams to help choose the two teams that play in the BCS championship game. Sports rating systems are also used to help determine the field for the NCAA men's and women's basketball tournaments, men's professional golf tournaments, professional tennis tournaments, and NASCAR. They are often mentioned in discussions about the teams that could or should receive "at large" invitations to participate in certain contests ("bubble teams").[1]


Sports rating systems have existed for almost 80 years. Early ratings were calculated on paper rather than by computer, as most are today. Some older systems still in use today include Jeff Sagarin's systems, the New York Times system, and the Dunkel Index, which dates back to 1929.


Sports rating systems use a variety of methods for rating teams, but the most prevalent is the power rating. The power rating of a team is a calculation of the team's strength relative to other teams in the same league or division. The basic idea is to maximize the number of transitive relations in a given data set of game outcomes. For example, if A defeats B and B defeats C, then it is reasonable to conclude that A>B>C.

There are obvious problems with basing a system solely on wins and losses. For example, if C then defeats A, an intransitive relation is established (A>B>C>A), and a ranking violation will occur if this is the only data available. Scenarios such as this happen fairly regularly in sports; for example, in the 2005 NCAA Division I-A football season, Penn State beat Ohio State, Ohio State beat Michigan, and Michigan beat Penn State. To address these logical breakdowns, rating systems usually consider other criteria such as the game's score and where the match was held (for example, to assess a home field advantage). In most cases, though, each team plays enough other games during a season to lessen the overall effect of such violations.
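The intransitive case can be checked directly. The following is a minimal sketch, not any published system's method: it takes the three 2005 results named above as win/loss pairs, tries every possible ordering of the three teams, and counts how many games each ordering contradicts. The function name `count_violations` is made up for this illustration.

```python
from itertools import permutations

# Results from the 2005 example in the text, as (winner, loser) pairs.
games = [
    ("Penn State", "Ohio State"),
    ("Ohio State", "Michigan"),
    ("Michigan", "Penn State"),
]

def count_violations(ranking, games):
    """Count games in which the lower-ranked team beat the higher-ranked one."""
    position = {team: i for i, team in enumerate(ranking)}  # 0 = best
    return sum(1 for winner, loser in games if position[winner] > position[loser])

teams = sorted({team for game in games for team in game})

# Brute force is fine for three teams; it would not scale to a full league.
best = min(permutations(teams), key=lambda order: count_violations(order, games))
min_violations = count_violations(best, games)
print(best, min_violations)
```

Because the three results form a cycle, no ordering is free of violations; with win/loss data alone, several orderings tie for the minimum.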

From an academic perspective, linear algebra and statistics are popular tools among many of the systems' authors for determining ratings. They are natural choices for such a problem, but using them effectively typically requires college-level education in these fields. Not surprisingly, many of the systems' authors have backgrounds in these areas, such as Jeff Sagarin, who received a Bachelor of Science in Mathematics from MIT in 1970.
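As an illustration of the kind of calculation involved, the sketch below fits ratings to score margins by least squares, so that the difference between two teams' ratings approximates the margin of their game. It is a generic textbook-style approach, not the method of Sagarin or any other named system. The function name `power_ratings`, the `home_edge` parameter, and the sample scores are all assumptions made for this example, and the schedule is assumed to connect every team to every other through some chain of games.

```python
def power_ratings(games, home_edge=0.0, sweeps=200):
    """Estimate ratings so that rating[home] - rating[away] + home_edge
    approximates each game's home-minus-away score margin.

    games: list of (home, away, home_points, away_points) tuples.
    home_edge: assumed fixed home field advantage, in points.
    """
    teams = sorted({t for g in games for t in g[:2]})
    # Collect, per team, (opponent, margin adjusted to a neutral site).
    schedule = {t: [] for t in teams}
    for home, away, hp, ap in games:
        neutral_margin = (hp - ap) - home_edge
        schedule[home].append((away, neutral_margin))
        schedule[away].append((home, -neutral_margin))

    rating = {t: 0.0 for t in teams}
    for _ in range(sweeps):
        updated = {}
        for t in teams:
            # Least-squares condition for one team: its rating equals the
            # average of (margin + opponent rating) over its games.
            target = sum(m + rating[o] for o, m in schedule[t]) / len(schedule[t])
            # Damped update: move halfway toward the target each sweep.
            updated[t] = 0.5 * rating[t] + 0.5 * target
        rating = updated

    # Ratings are only meaningful relative to each other; center them on zero.
    mean = sum(rating.values()) / len(rating)
    return {t: r - mean for t, r in rating.items()}

# Made-up scores: A beats B by 7, B beats C by 3, A beats C by 10.
sample_games = [("A", "B", 27, 20), ("B", "C", 17, 14), ("A", "C", 24, 14)]
ratings = power_ratings(sample_games)
for team, value in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {value:+.2f}")
```

The sample margins are mutually consistent, so the fitted rating differences reproduce them exactly; with real results the fit is only approximate, and the leftover error is what distinguishes one system's choices from another's.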

Pros and cons of sports rating systems


  • Ratings are objective, without specific player, team, regional, or style bias
  • Ratings are verifiable and repeatable
  • Ratings are comprehensive, requiring assessment of all selected criteria
  • Ratings do not "forget old games" (although some are designed to diminish their overall weight as a season progresses)
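The last item above mentions that some systems reduce the weight of older games as a season progresses. One simple way this could be done is an exponential decay by week, sketched below. The function name, the `decay` factor, and the sample margins are hypothetical and are not taken from any particular rating system.

```python
def weighted_average_margin(margins_by_week, current_week, decay=0.9):
    """Average a team's score margins, down-weighting older games.

    margins_by_week: list of (week_played, margin) pairs for one team.
    decay: hypothetical per-week factor in (0, 1]; 1.0 weights all games equally.
    """
    weights = [decay ** (current_week - week) for week, _ in margins_by_week]
    total = sum(w * m for w, (_, m) in zip(weights, margins_by_week))
    return total / sum(weights)

# Made-up margins for one team over three weeks.
margins = [(1, 14), (2, -3), (3, 7)]
equal_weighted = weighted_average_margin(margins, current_week=3, decay=1.0)
recent_weighted = weighted_average_margin(margins, current_week=3, decay=0.5)
print(equal_weighted, recent_weighted)
```

With `decay=1.0` every game counts equally; smaller values shift the average toward the most recent results without discarding the early ones.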


  • Formulae criteria are subjective (some use just scores, some use margin of victory, some include auxiliary game data, such as hits, interceptions, lead changes, shot percentage) with no consensus among authors on which to include, even minimally
  • Most ignore qualitative criteria such as weather, participation (or lack thereof due to injuries or "throw-away" games, discussed below), and individual efforts
  • Some assume parity among all members of the league, such as each team being built from an equitable pool of players via a draft or free agency system as is done in many major league sports such as the NFL, MLB, NBA, and NHL. This is certainly not the case in collegiate leagues such as Division I-A football or men's and women's basketball.
  • If there is insufficient "inter-divisional" league play, teams in an isolated division may be artificially propped up or down in the overall ratings because there are too few results connecting them to other teams in the league. This phenomenon is evident in systems that analyze historical college football seasons, such as when the top Ivy League teams of the 1970s, like Dartmouth, were calculated by some rating systems to be comparable with accomplished powerhouse teams of that era such as Nebraska, USC, and Ohio State. This conflicts with the subjective opinion that, while good in their own right, they were not nearly as good as those top programs. However, this may be considered a "pro" by non-BCS teams in Division I-A college football, who point out that rating systems have shown that their top teams belong in the same stratum as the BCS teams. One example is the 2004 Utah team, which went undefeated in the regular season and earned a BCS bowl bid due to the bump in its overall BCS rating from the computer ratings component. Utah went on to play and defeat the Big East Conference champion Pittsburgh in the 2005 Fiesta Bowl by a score of 35-7. A related example occurred during the 2006 NCAA Men's Basketball Tournament, when George Mason was awarded an at-large tournament bid due to its regular season record and RPI rating and rode that opportunity all the way to the Final Four.
  • The goals of rating systems differ from one another. For example, some systems are crafted to provide a perfect retrodictive analysis of the games played to date, while others are predictive and give more weight to future trends than to past results. This creates the potential for misinterpretation of rating system results by people unfamiliar with these goals; for example, a rating system designed to give accurate point spread predictions for gamblers might be ill-suited for selecting the teams most deserving to play in a championship game or tournament. Additionally, this issue limits the use of ratings in combination with each other as part of a consensus rating, such as the one the BCS computer component is supposed to provide, as the results may be skewed and exhibit unacceptably large deviations from the overall average ranking per team.
  • Rating systems cannot ignore "throw-away" games. These are games in which teams that have already earned a post-season bid and secured their playoff seeding before the end of the regular season rest or protect their starters by benching them for the remaining regular season games. This usually results in unpredictable outcomes, and without a mechanism to ignore such games (which would run counter to the goals of such systems), they unintentionally skew the outcomes of rating systems.
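The point above about consensus ratings can be made concrete by averaging each team's rank across several systems and measuring how much the systems disagree. The sketch below uses a plain average and a population standard deviation; it is not the BCS formula, and the system names, team names, and ranks are all invented for the illustration.

```python
from statistics import mean, pstdev

# Hypothetical ordinal ranks (1 = best) from three made-up systems.
ranks_by_system = {
    "system_x": {"Team A": 1, "Team B": 2, "Team C": 3},
    "system_y": {"Team A": 2, "Team B": 1, "Team C": 3},
    "system_z": {"Team A": 1, "Team B": 3, "Team C": 2},
}

def consensus(ranks_by_system):
    """Return {team: (average rank, spread of ranks)} across systems."""
    teams = next(iter(ranks_by_system.values())).keys()
    summary = {}
    for team in teams:
        ranks = [system[team] for system in ranks_by_system.values()]
        summary[team] = (mean(ranks), pstdev(ranks))
    return summary

summary = consensus(ranks_by_system)
for team, (avg, spread) in sorted(summary.items(), key=lambda kv: kv[1][0]):
    print(f"{team}: average rank {avg:.2f}, spread {spread:.2f}")
```

A large spread for a team signals that the component systems disagree about it, which is more likely when some are retrodictive and others predictive.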

List of sports rating systems


United States

Ratings of other contests


  1. ^ Fagan, Ryan (2011-03-09), "Sorting through teams on one big bubble", Sporting News, retrieved 2011-03-24, "This is a look at 20 of the teams (in alphabetical order) residing on this year’s big ol’ bubble. We’ve included three statistical rankings. The RPI (ratings percentage index, taken from is considered the standard and is provided to committee members during the selection process. The two other ranking indexes include margin of victory in their formulas—the Pomeroy ratings (at and Sagarin ratings (via USA Today)—aren’t new but have played an increased role in discussions about potential seeds during this college basketball season." 

External links