Rank-size distribution

From Wikipedia, the free encyclopedia
Rank-size distribution of the population of countries follows a stretched exponential distribution[1] except in the cases of the two "Kings": China and India.

Rank-size distribution is the distribution of size by rank, in decreasing order of size. For example, if a data set consists of items of sizes 5, 100, 5, and 8, the rank-size distribution is 100, 8, 5, 5 (ranks 1 through 4). This is also known as the rank-frequency distribution, when the source data are from a frequency distribution. These are particularly of interest when the data vary significantly in scale, such as city size or word frequency. These distributions frequently follow a power law distribution, or less well-known ones such as a stretched exponential function or parabolic fractal distribution, at least approximately for certain ranges of ranks; see below.
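The construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `rank_size` is hypothetical, and tied sizes are simply given consecutive ranks, as in the example in the text.

```python
def rank_size(sizes):
    """Return (rank, size) pairs in decreasing order of size.

    Tied sizes receive consecutive ranks, matching the example in the text.
    """
    ordered = sorted(sizes, reverse=True)
    return list(enumerate(ordered, start=1))


# The data set from the text: sizes 5, 100, 5, and 8.
pairs = rank_size([5, 100, 5, 8])
print(pairs)  # [(1, 100), (2, 8), (3, 5), (4, 5)]
```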

A rank-size distribution is not a probability distribution or cumulative distribution function. Rather, it is a discrete form of a quantile function (inverse cumulative distribution) in reverse order, giving the size of the element at a given rank.

Simple rank-size distributions

In the case of city populations, the resulting distribution in a country, a region, or the world is characterized by its largest city, with other cities decreasing in size relative to it, initially at a rapid rate and then more slowly. This results in a few large cities and a much larger number of cities that are orders of magnitude smaller. In the idealized case, for example, a rank 3 city would have one-third the population of a country's largest city, a rank 4 city would have one-fourth the population of the largest city, and so on; real city systems often deviate from this pattern.
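The idealized pattern in this example, in which the city at rank r has 1/r of the largest city's population, can be written out as a short Python sketch. The function name `ideal_rank_size` and the population figure are hypothetical and chosen only for illustration.

```python
def ideal_rank_size(largest, n_ranks):
    """Sizes predicted by the idealized rule: size at rank r = largest / r."""
    return [largest / r for r in range(1, n_ranks + 1)]


# Hypothetical largest city of 1,200,000 inhabitants.
sizes = ideal_rank_size(1_200_000, 4)
print(sizes)  # [1200000.0, 600000.0, 400000.0, 300000.0]
```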

When any log-linear factor is ranked, the ranks follow the Lucas numbers, which consist of the sequentially additive numbers 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, etc. Like the more famous Fibonacci sequence, each number is approximately 1.618 (the Golden ratio) times the preceding number. For example, the third term in the sequence above, 4, is approximately 1.618^3, or 4.236; the fourth term, 7, is approximately 1.618^4, or 6.854; the eighth term, 47, is approximately 1.618^8, or 46.979. With higher values, the figures converge. An equiangular spiral is sometimes used to visualize such sequences.
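The numerical relationship between the Lucas numbers and powers of the golden ratio can be checked with a brief Python sketch. The function name `lucas` is hypothetical, and terms are numbered from 1 so that the indexing matches the sequence as listed in the text.

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio, approximately 1.618


def lucas(n_terms):
    """Return the first n_terms of the sequence 1, 3, 4, 7, 11, ..."""
    terms = [1, 3]
    while len(terms) < n_terms:
        terms.append(terms[-1] + terms[-2])
    return terms[:n_terms]


# Compare the n-th term with PHI raised to the n-th power.
for n, term in enumerate(lucas(8), start=1):
    print(n, term, round(PHI ** n, 3))
```

The gap between each term and the corresponding power of the golden ratio shrinks as n grows, which is the convergence the text refers to.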

Rank-size rule

The rank-size rule (or law) describes the remarkable regularity in many phenomena, including the distribution of city sizes, the sizes of businesses, the sizes of particles (such as sand), the lengths of rivers, the frequencies of word usage, and wealth among individuals.

All are real-world observations that approximately follow power laws, such as Zipf's law, the Yule distribution, or the Pareto distribution. If one ranks the population size of cities in a given country or in the entire world and plots the natural logarithm of the city population against the natural logarithm of the rank, the resulting graph will be approximately linear. This is the rank-size distribution.[2]
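A minimal Python sketch of this procedure fits a straight line to the logarithm of size against the logarithm of rank by ordinary least squares. The function name `loglog_fit` and the synthetic data are assumptions made for illustration; least squares on logged data is a common quick check, not necessarily the method used in the cited sources.

```python
import math


def loglog_fit(sizes):
    """Least-squares fit of ln(size) = intercept + slope * ln(rank).

    Needs at least two positive sizes. Returns (slope, intercept).
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data that follow the idealized rule exactly: size = 1000 / rank.
slope, intercept = loglog_fit([1000 / r for r in range(1, 21)])
print(round(slope, 6), round(math.exp(intercept), 3))
```

For data that follow the idealized rule, the fitted slope is close to -1 and the exponential of the intercept is close to the size of the largest item.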

Theoretical rationale

One study claims that the rank-size rule "works" because it is a "shadow" or coincidental measure of the true phenomenon.[3] The true value of the rank-size rule is thus not as an accurate mathematical measure (since other power-law formulas are more accurate, especially at ranks lower than 10) but rather as a handy measure or "rule of thumb" to spot power laws. When presented with a ranking of data, is the third-ranked variable approximately one-third the value of the highest-ranked one? Or, conversely, is the highest-ranked variable approximately ten times the value of the tenth-ranked one? If so, the rank-size rule has possibly helped spot another power law relationship.
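The two questions above amount to comparing the top-ranked value with the values at ranks 3 and 10. The following Python sketch performs that comparison; the function name `rule_of_thumb_ratios` and the sample data are hypothetical.

```python
def rule_of_thumb_ratios(sizes, ranks=(3, 10)):
    """Ratio of the largest value to the value at each requested rank.

    Under the rank-size rule, the ratio at rank r is roughly r.
    Ranks beyond the end of the data are skipped.
    """
    ordered = sorted(sizes, reverse=True)
    return {r: ordered[0] / ordered[r - 1] for r in ranks if r <= len(ordered)}


# Hypothetical data that follow the rule exactly: size = 600 / rank.
ratios = rule_of_thumb_ratios([600 / r for r in range(1, 11)])
print(ratios)
```

Ratios near 3 and 10 suggest that a power law may be worth testing more carefully; they do not establish one.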

Known exceptions to simple rank-size distributions

While Zipf's law works well in many cases, it tends not to fit the largest cities in many countries; one type of deviation is known as the King effect. A 2002 study found that Zipf's law was rejected for 53 of 73 countries, far more than would be expected based on random chance.[4] The study also found that variations of the Pareto exponent are better explained by political variables than by economic geography variables like proxies for economies of scale or transportation costs.[5] A 2004 study showed that Zipf's law did not work well for the five largest cities in six countries.[6]

In the richer countries, the distribution was flatter than predicted. For instance, in the United States, although its largest city, New York City, has more than twice the population of second-place Los Angeles, the two cities' metropolitan areas (also the two largest in the country) are much closer in population: the New York metropolitan area is only about 1.3 times as large as that of Los Angeles. In other countries, the largest city dominates much more than expected. For instance, in the Democratic Republic of the Congo, the capital, Kinshasa, is more than eight times larger than the second-largest city, Lubumbashi.

When considering the entire distribution of cities, including the smallest ones, the rank-size rule does not hold. Instead, the distribution is log-normal. This follows from Gibrat's law of proportionate growth.

Because exceptions are so easy to find, the function of the rule for analyzing cities today is to compare the city-systems in different countries. The rank-size rule is a common standard by which urban primacy is established. A distribution such as that in the United States or China does not exhibit a pattern of primacy, whereas countries with a dominant "primate city" deviate from the rank-size rule in the opposite direction. Therefore, the rule helps to classify national (or regional) city-systems according to the degree of dominance exhibited by the largest city. Countries with a primate city, for example, have typically had a colonial history that accounts for that city pattern.

If a normal city distribution pattern is expected to follow the rank-size rule (i.e. if the rank-size principle correlates with central place theory), then it suggests that those countries or regions with distributions that do not follow the rule have experienced some conditions that have altered the normal distribution pattern. For example, the presence of multiple regions within large nations such as China and the United States tends to favor a pattern in which more large cities appear than would be predicted by the rule. By contrast, small countries that had been connected (e.g. colonially or economically) to much larger areas will exhibit a distribution in which the largest city is much larger than would fit the rule, compared with the other cities. The excessive size of the city theoretically stems from its connection with a larger system rather than the natural hierarchy that central place theory would predict within that one country or region alone.
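One simple way to express the degree of dominance of the largest city is the ratio of the largest to the second-largest city, which the idealized rank-size rule puts at 2. The Python sketch below is illustrative only: the function names, the wording of the labels, and the population figures are hypothetical, and published primacy measures may be defined differently.

```python
def primacy_ratio(sizes):
    """Ratio of the largest to the second-largest size."""
    ordered = sorted(sizes, reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two sizes")
    return ordered[0] / ordered[1]


def compare_to_rule(sizes, expected=2.0):
    """Describe how the top of the distribution deviates from the rule."""
    ratio = primacy_ratio(sizes)
    if ratio > expected:
        return "largest city more dominant than the rule predicts"
    if ratio < expected:
        return "flatter than the rule predicts"
    return "matches the rule"


# Hypothetical city systems (populations in arbitrary units).
print(compare_to_rule([130, 100, 50]))  # ratio 1.3
print(compare_to_rule([800, 100, 90]))  # ratio 8.0
```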


  1. "Stretched exponential distributions in nature and economy: 'fat tails' with characteristic scales", J. Laherrère and D. Sornette
  2. "Zipf's Law, or the Rank-Size Distribution", Steven Brakman, Harry Garretsen, and Charles van Marrewijk
  3. "The Urban Rank-Size Hierarchy", James W. Fonseca
  4. Kwok Tong Soo (2002)
  5. "Zipf's Law, or the Rank-Size Distribution"
  6. David Cuberes, "The Rise and Decline of Cities", University of Chicago, September 29, 2004
