User:Nog33/Preferential attachment

From Wikipedia, the free encyclopedia

Preferential attachment is the name given to a process in which some quantity, typically some form of wealth or credit, is distributed among a growing number of individuals or objects according to how much they already have, so that those who are already wealthy receive more than those who are not. "Preferential attachment" is only the most recent of many names that have been given to this process. It is also referred to as a "Yule process", "Gibrat's law", "cumulative advantage", "the rich get richer", and, less correctly, the "Matthew effect".

Definition

A preferential attachment process is a stochastic urn process characterized by two defining features:

  • the number of urns increases over time and the total amount of wealth distributed among them similarly increases;
  • the newly added wealth is distributed among urns as some increasing function of the amount they already have.

A classic example of a preferential attachment process is the growth in the number of species per genus in some higher taxon of biotic organisms. New genera ("urns") are added to a taxon whenever a new species is considered sufficiently different from its predecessors that it does not belong in any extant genus. New species ("wealth") are added as old ones speciate and, assuming that most species belong to the same genus as their parent, the rate at which species are added to an existing genus will be proportional to the number of species that genus already has. This process, first studied by Yule, is thus a linear preferential attachment process.
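The two defining features above can be illustrated with a small simulation. The following is a minimal sketch rather than a reference implementation: the function name, the parameters `m` and `a`, and the choice to start each new urn empty are assumptions made for illustration. Each unit of wealth goes to an urn chosen with probability proportional to its current count k plus the constant `a`.

```python
import random


def simulate_preferential_attachment(num_urns, m=1, a=1.0, seed=0):
    """Grow `num_urns` urns, adding `m` units of wealth after each new urn.

    Each unit goes to an existing urn with probability proportional to
    (k + a), where k is that urn's current count and a > 0.
    Assumption for this sketch: a new urn starts with k = 0.
    """
    rng = random.Random(seed)
    counts = []   # counts[i] = wealth held by urn i
    tokens = []   # one entry per unit of wealth, holding its urn index
    for _ in range(num_urns):
        counts.append(0)
        for _ in range(m):
            n, total = len(counts), len(tokens)
            # Weight (k + a) is a mixture: a uniform part of total weight
            # a * n and a part proportional to k of total weight `total`.
            if rng.random() * (a * n + total) < a * n:
                i = rng.randrange(n)               # uniform over urns
            else:
                i = tokens[rng.randrange(total)]   # proportional to k
            counts[i] += 1
            tokens.append(i)
    return counts


if __name__ == "__main__":
    counts = simulate_preferential_attachment(20000, m=1, a=1.0, seed=1)
    print("largest urns:", sorted(counts, reverse=True)[:5])
    print("mean wealth per urn:", sum(counts) / len(counts))
```

Keeping one list entry per unit of wealth makes a uniform draw from that list equivalent to a draw proportional to k, which avoids recomputing weights at every step. In runs of this kind a few urns typically end up far above the mean, which is the long-tailed behavior the process is known for.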

The linear preferential attachment process is known to produce a distribution of species over genera following the so-called Yule–Simon distribution. If species are added to the system at an overall rate of m per new genus and added to specific genera at a rate proportional to the number k that those genera already have, plus a constant a>0, then the fraction P(k) of genera having k species in the limit of long times is given by

$$P(k) = \frac{\mathrm{B}(k+a,\gamma)}{\mathrm{B}(a,\gamma-1)}, \qquad k \ge 0,$$

where B(x,y) is the Euler beta function:

$$\mathrm{B}(x,y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)},$$

with Γ(x) being the standard gamma function, and

$$\gamma = 2 + \frac{a}{m}.$$

The beta function behaves asymptotically as B(x,y) ~ x^{-y} for fixed y and large x, which implies that for large values of k we have

$$P(k) \propto k^{-\gamma}.$$

In other words, the preferential attachment process generates a "long-tailed" distribution whose tail follows a Pareto distribution, or power law. This is the primary reason for the historical interest in preferential attachment: the species distribution and many other phenomena are observed empirically to follow power laws, and the preferential attachment process is a leading candidate mechanism to explain this behavior. Preferential attachment is considered a possible candidate for, among other things, the distribution of the sizes of cities, the wealth of extremely wealthy individuals, the number of citations received by learned publications, and the number of links to pages on the World Wide Web.
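The power-law tail can be checked numerically. The sketch below assumes that the limiting distribution takes the form P(k) = B(k+a, γ)/B(a, γ−1) with γ = 2 + a/m for k ≥ 0; the function names are illustrative and not part of any library. It works with logarithms of the gamma function so that large k does not overflow.

```python
import math


def log_beta(x, y):
    """Logarithm of the beta function B(x, y) via log-gamma."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def log_pmf(k, m=1.0, a=1.0):
    """log P(k) for the assumed form B(k + a, gamma) / B(a, gamma - 1)."""
    gamma = 2.0 + a / m
    return log_beta(k + a, gamma) - log_beta(a, gamma - 1.0)


def local_exponent(k, m=1.0, a=1.0):
    """Estimate the tail exponent from the log-log slope between k and 2k."""
    return -(log_pmf(2 * k, m, a) - log_pmf(k, m, a)) / math.log(2.0)


if __name__ == "__main__":
    m, a = 1.0, 1.0
    print("gamma =", 2.0 + a / m)
    for k in (10, 100, 1000, 10000):
        print(k, local_exponent(k, m, a))
```

Under these assumptions the estimated slope should drift toward γ as k grows, which is one way to see that the beta-function expression and the power law agree only in the tail and not at small k.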

Preferential attachment is sometimes referred to as the Matthew effect, but the two are not precisely equivalent. The Matthew effect, first discussed by Robert Merton[1], is named for a passage in the biblical Gospel of Matthew: "For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath." (Matthew 25:29, King James Version.) The preferential attachment process does not incorporate the taking away part. An urn process that includes both the giving and the taking away would produce a log-normal distribution rather than a power law. This point may be moot, however, since the scientific insight behind the Matthew effect is in any case entirely different. Qualitatively it is intended to describe not a mechanical multiplicative effect like preferential attachment but a specific human behavior in which people are more likely to give credit to the famous than to the little known. The classic example of the Matthew effect is a scientific discovery made simultaneously by two different people, one well known and the other little known. It is claimed that under these circumstances people tend more often to credit the discovery to the well-known scientist. Thus the real-world phenomenon the Matthew effect is intended to describe is quite distinct from (though certainly related to) preferential attachment.

History

The first rigorous consideration of preferential attachment seems to be that of Yule in 1925, who used it to explain the power-law distribution of the number of species per genus of flowering plants[2]. The process is sometimes called a "Yule process" in his honor. Yule was able to show that the process gave rise to a distribution with a power-law tail, but the details of his proof are, by today's standards, contorted and difficult, since the modern tools of stochastic process theory did not yet exist and he was forced to use more cumbersome methods of proof.

Most modern treatments of preferential attachment make use of the master equation method, whose use in this context was pioneered by Simon in 1955, in work on the distribution of sizes of firms[3].
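As a rough sketch of how the master equation method applies to the linear process (an illustration under stated assumptions, not a reproduction of Simon's original argument), suppose that each new genus starts with k = 0 species, that m species are added per new genus, and that a genus with k species gains the next one at a rate proportional to k + a. With n genera the total attachment weight is then n(m + a). Writing p_k for the long-time fraction of genera with k species, the stationary balance of genera entering and leaving each size class reads

$$
\begin{aligned}
p_k &= \frac{m}{m+a}\,\bigl[(k-1+a)\,p_{k-1} - (k+a)\,p_k\bigr], \qquad k \ge 1,\\
p_0 &= 1 - \frac{m\,a}{m+a}\,p_0 .
\end{aligned}
$$

Setting γ = 2 + a/m and rearranging gives

$$
p_0 = \frac{\gamma-1}{\gamma-1+a}, \qquad
p_k = \frac{k-1+a}{k+a+\gamma-1}\,p_{k-1},
$$

and iterating the recurrence yields a ratio of gamma functions that can be written as

$$
p_k = \frac{\mathrm{B}(k+a,\gamma)}{\mathrm{B}(a,\gamma-1)} .
$$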

The first application of preferential attachment to learned citations was given by Price in 1976[4]. (He referred to the process as a "cumulative advantage" process.) His was also the first application of the process to the growth of a network, producing what would now be called a scale-free network. It is in the context of network growth that the process is most frequently studied today. Price also promoted preferential attachment as a possible explanation for power laws in many other phenomena, including Lotka's law of scientific productivity and Bradford's law of journal use.

The application of preferential attachment to the growth of the World Wide Web was proposed by Barabási and Albert in 1999[5]. Barabási and Albert also coined the name "preferential attachment", by which the process is best known today, and suggested that the process might apply to the growth of many other networks as well.

See also

References

  1. Merton, Robert K. (1968). "The Matthew effect in science". Science. 159 (3810): 56–63. doi:10.1126/science.159.3810.56.
  2. Yule, G. U. (1925). "A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S.". Philosophical Transactions of the Royal Society of London, Ser. B. 213: 21–87.
  3. Simon, H. A. (1955). "On a class of skew distribution functions". Biometrika. 42 (3–4): 425–440. doi:10.1093/biomet/42.3-4.425.
  4. Price, D. J. de S. (1976). "A general theory of bibliometric and other cumulative advantage processes". J. Amer. Soc. Inform. Sci. 27 (5): 292–306. doi:10.1002/asi.4630270505.
  5. Barabási, A.-L.; Albert, R. (1999). "Emergence of scaling in random networks". Science. 286 (5439): 509–512. doi:10.1126/science.286.5439.509. PMID 10521342.