# Talk:Galton–Watson process

WikiProject Human Genetic History (Rated B-class, High-importance)
This article is within the scope of WikiProject Human Genetic History, a collaborative effort to improve the coverage of genetic genealogy, population genetics, and associated theory and methods articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B  This article has been rated as B-Class on the quality scale.
High  This article has been rated as High-importance on the importance scale.
WikiProject Genetics (Rated B-class, Low-importance)
This article is within the scope of WikiProject Genetics, a collaborative effort to improve the coverage of Genetics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B  This article has been rated as B-Class on the project's quality scale.
Low  This article has been rated as Low-importance on the project's importance scale.
WikiProject Statistics (Rated B-class, Low-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

B  This article has been rated as B-Class on the quality scale.
Low  This article has been rated as Low-importance on the importance scale.
WikiProject Mathematics (Rated B-class, Low-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
 B Class
 Low Importance
Field: Probability and statistics

The external links are both dead. Need updating. — Preceding unsigned comment added by 137.112.148.179 (talk) 03:00, 16 November 2005‎

No, the second one is OK, but you need Acrobat or something similar to view it. Not very "accessible". 203.218.141.99 07:32, 11 April 2006 (UTC)

Both links work fine for me. One is a PDF file; one reads it with Acrobat. The other is a PS file. On the Linux machine I'm using, with my preferences set as they are, I read it with ghostview. On Microsoft systems, I suspect the appropriate software would be called gsview or something like that. Michael Hardy 20:19, 11 April 2006 (UTC)

## example: one-child policy

Suggest:

"As a concrete example, suppose in one generation there are 100 persons (50 male and 50 female) with unique surnames, and a requirement that every person can participate in the conception and naming of at most one child. Then the next generation can have at most 50 unique surnames."

http://en.wikipedia.org/wiki/One-child_policy — Preceding unsigned comment added by 66.228.79.25 (talk) 20:57, 11 July 2006‎

## Is it worth mentioning the original paper is wrong?

I just went through the Galton-Watson paper mentioned in the history section, which concludes by saying (incorrectly) that "whenever the survival probabilities can be represented by a polynomial, ... all the surnames, therefore, tend to extinction." It comes to this false conclusion because it says,

We get the equation

${}_r m_0=\frac{1}{(a+b)^q}\left\{ a+b\cdot{}_{r-1}m_0\right\}^q$
[Here ${}_r m_0$ denotes the probability of extinction after r generations.]

whence it follows that as r increases indefinitely the value of ${}_r m_0$ approaches indefinitely to the value y where

$y=\frac{1}{a+b}\left \{ a+by\right \}$

that is where y=1.

The first formula is correct, but the second is not; it should be

$y=\frac{1}{(a+b)^q}\left \{ a+by\right \}^q$

This does have y=1 as one solution, but there is also a solution with y<1 if the expected number of male descendants is greater than 1. Moreover, the sequence (${}_r m_0$) will approach that smaller solution rather than 1, so the probability of the surname surviving indefinitely is nonzero. This is in agreement with the Wikipedia article, but not the Galton-Watson paper.

Meanwhile, all of this is original research, so Wikipedia can't include any of it without a citation. skeptical scientist (talk) 19:53, 31 January 2009 (UTC)
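For anyone who wants to check the limit numerically, here is a minimal Python sketch that iterates the first (correct) recurrence quoted above. It assumes the starting value ${}_0 m_0 = 0$ (a single male ancestor, not yet extinct), which the quoted passage does not state, and the parameter values a = b = 1, q = 3 are hypothetical, chosen only so that the expected number of sons q·b/(a+b) = 1.5 exceeds 1. The function names are illustrative, not taken from any source.

```python
def pgf(s, a, b, q):
    """Right-hand side of the quoted recurrence: ((a + b*s) / (a + b))**q."""
    return ((a + b * s) / (a + b)) ** q


def extinction_by_generation(a, b, q, generations):
    """Return [_0 m_0, _1 m_0, ..., _r m_0] for r = generations.

    Assumption: _0 m_0 = 0, i.e. the line starts with one living male.
    """
    m = 0.0
    history = [m]
    for _ in range(generations):
        m = pgf(m, a, b, q)  # _r m_0 computed from _{r-1} m_0
        history.append(m)
    return history


# Hypothetical parameters: mean number of sons is q*b/(a+b) = 1.5 > 1.
limit = extinction_by_generation(1.0, 1.0, 3, 200)[-1]
print(limit)
```

With these values the corrected fixed-point equation reduces to (y-1)(y^2+4y-1)=0, whose root in [0,1) is sqrt(5)-2, roughly 0.236, and the iteration settles there rather than at 1. Calling `extinction_by_generation(3.0, 1.0, 3, 200)` instead (mean 0.75) gives a limit of 1, as expected when the mean is below 1.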

## Definition unclear

The definition uses (n+1) in the superscript, and it is not clear what this means. The paragraph below implies that $\xi_j^{(n)}$ is a sequence. I am not sure what it means to be summing whole sequences (as opposed to elements), especially ones with a different number of elements. I think something else is meant here. If it is not wrong, I think there is something missing in the explanation of the notation. --MATThematical (talk) 16:20, 16 June 2010 (UTC)

I added one sentence describing the correspondence between formal definition and analogy with family names; hopefully this makes the definition easier to parse. I'm not exactly sure if it's fair to call $\{\xi_j^{(n)}\}$ a sequence, as it's a function, not from natural numbers to random variables, but from pairs (j,n) of natural numbers with $j \le X_n$ to random variables (and so not technically a sequence, or perhaps an infinite sequence of finite sequences). However, each individual term $\xi_j^{(n)}$ is a natural-number-valued random variable, and those can certainly be summed. skeptical scientist (talk) 13:28, 19 January 2012 (UTC)
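To make the indexing concrete, here is a small Python sketch of how the terms are summed. It assumes the usual recurrence $X_{n+1} = \sum_{j=1}^{X_n} \xi_j^{(n)}$ with $X_0 = 1$; that form is an assumption here, not a quotation from the article. The function names and the offspring law `two_coin_flips` are hypothetical and serve only as illustration.

```python
import random


def simulate(offspring, generations, x0=1, seed=0):
    """Return [X_0, X_1, ..., X_generations] for one run of the process.

    offspring(rng) draws one value of xi_j^{(n)}: the number of children
    of individual j in generation n.
    """
    rng = random.Random(seed)
    sizes = [x0]
    for n in range(generations):
        # xi[j-1] plays the role of xi_j^{(n)} for j = 1, ..., X_n.
        xi = [offspring(rng) for _ in range(sizes[-1])]
        # X_{n+1} is the sum of single terms, not of whole sequences.
        sizes.append(sum(xi))
    return sizes


def two_coin_flips(rng):
    """Hypothetical offspring law: 0, 1, or 2 children (two fair coin flips)."""
    return rng.randint(0, 1) + rng.randint(0, 1)


print(simulate(two_coin_flips, 10))
print(simulate(lambda rng: 2, 3))  # every individual has exactly 2 children
```

The second call is deterministic and returns `[1, 2, 4, 8]`. In each generation only the $X_n$ terms belonging to living individuals are drawn and added, which is the "infinite sequence of finite sequences" reading described above.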

Nbarth (talk · contribs) added Vietnamese, Korean and Chinese as examples of surname extinction. Unless a source is provided, I will remove them.

I do not think these Sinitic names underwent surname extinction. A Sinitic surname is a name for a whole lineage and extremely rarely goes extinct (Japanese surnames are totally different, as I already described at Talk:Japanese name#Surname extinction?). In Korea and Vietnam, a limited number of ruling-class clans first adopted Sinitic surnames, and as a result, a small number of surnames came into use. Later, commoners followed them, choosing surnames from the small name pool. That's why they have few surnames. --Nanshu (talk) 05:28, 26 November 2011 (UTC)

Good point about difference in number of family names not being primarily due to extinction, but rather to different processes of creation (and to adoption due to other reasons, as in Nguyễn) – thanks!
I’ve re-written the section in this edit to fix this; it’s much more complex than the simplistic “these are old, hence few, these are new, hence many” caricature there was previously.
I did add the Chinese and Korean examples (in this edit), but I didn’t actually add Vietnamese (someone else did). I also added the modern (Dutch, Japanese, Thai) counter-examples.
The example of Chinese names is very well studied, and Chinese surnames have definitely experienced significant extinction, from close to 12,000 recorded surnames in the past to about 3,100 now (a reduction by a factor of about 4, or about 75%), as these references state:
The main authority on this (or at least most-quoted in English) seems to be Du Ruofu, who seems a noted Chinese researcher (Chinese Academy of Sciences); I learned about this from the 1995 Economist article.
Chinese is the classic example of this, and it is frequently contrasted with Japanese (as the first paper does), so I think it appropriate to include here, but, as you note at Japanese names, it’s more due to creative Japanese naming, rather than the recent history.
I don’t know if Korean and Vietnamese names have undergone significant name extinction, but clearly the original small number and other effects are more significant factors, as you note.
I’ve also re-written the Japanese and Korean pages to reflect this.
The issue of the huge diversity of number of family names between countries is clearly of interest (100 Vietnamese names vs. 100,000+ Japanese names) and belongs somewhere, but perhaps it’s better placed at Surname or Family name than here, since it’s not primarily due to this process? (This process seems the main mathematical theory of name frequency/distribution, hence of related interest, but the facts of frequency are separate.)
Thanks again – reading up on this and tracking down papers took some time, but it’s very informative and a much more nuanced picture than the simplistic (and incorrect) explanation before.
Please feel free to make further suggestions or changes as you see fit – in particular, perhaps the extreme examples of name frequency (v. few, v. many) would be better placed somewhere else, and perhaps reworded (if not on this page)? (The US is another example, with over 150,000 family names, AFAICT, where this reflects multiethnic origins.)
—Nils von Barth (nbarth) (talk) 12:11, 26 November 2011 (UTC)
Thank you for providing references. They are very intriguing. Unfortunately I have no time to take a closer look right now. There is just one point I would like to note. The surname of the first author of the 1992 paper is Du, not Ruofu. This paper must be cited as Du et al., 1992. --Nanshu (talk) 13:39, 28 November 2011 (UTC)
Thanks for the catch – I thought “Ruofu” was a funny Chinese family name (almost always monosyllabic), but the order listed in the reference was inconsistent (Chinese names family first, Western names family last) – fixed!
—Nils von Barth (nbarth) (talk) 04:46, 1 December 2011 (UTC)

I have quickly scanned Du et al., 1992. Its main subject is not surname extinction and it makes no mention of the Galton–Watson process. So I am unsure if the following citation at Chinese surname#Surnames at present is valid:

Of the thousands of surnames which have been identified from historical texts prior to the [[Han Dynasty]], most have either been lost (via the [[Galton–Watson process]] of extinction of family names)<ref>{{Harv|Du et al.|1992}}</ref> or simplified. Historically there are close to 12,000 surnames recorded, of which only about 3,100 are in current use,<ref>{{Harv|Economist|1995}}</ref>

The authors give some possible reasons for the dwindling of surnames (pp. 19–22). The no-children assumption is not presented. Personally I doubt that a lineage dies out peacefully. If a lineage died out, it must have involved a catastrophe of the kind that characterized the end of a dynastic cycle, in which the population was reduced to a half, a quarter or even less (the Three Kingdoms period is a well-known example). Unfortunately, I have never read any literature that relates this to surname extinction. --Nanshu (talk) 13:57, 4 December 2011 (UTC)

Thanks again! You’re right, that was lazy linking on my part. I’ve changed the link from Galton–Watson process to instead point to extinction of family names (in this edit).
While surnames certainly have become extinct, you’re right that this is not primarily (or perhaps even significantly?) due to the Galton–Watson process, and the source doesn’t say this (it just says “extinction”), so I’ve removed references to “family lines dying out” from the Chinese surname page (in this edit); if someone finds a reference actually citing this, it could be added back.
I’ve left a brief note on “family lines dying out possibly playing a part” on this page (since this is the Galton–Watson page, showing connection to content), but noting that this is not the main story.
Hope the current statements in the articles look ok; feel free to correct or comment if not!
You’re right that the Du et al. 1992 paper is not a great reference; it does cover surname extinction briefly, and the content seems to be “conventional wisdom” on Chinese surname extinction (which is what I’m using it for – some brief general remarks), but proper focused references are really necessary. I’m also completely unfamiliar with this literature; hopefully someday an expert will be of help. If you find the time to read up on it, please improve it!
—Nils von Barth (nbarth) (talk) 07:40, 1 January 2012 (UTC)

## edit on assumption

I modified the sentence "Assume, as was taken for granted in Galton's time, that surnames are passed on to all male children by their father".

Indeed it is quite absurd to suggest that this assumption was ever taken for granted. Children born out of wedlock have always been a known reality. Moreover, even within the traditional family paradigm this assumption was not at all the standard in all cultures, so it is not only a question of "time" but also of place. — Preceding unsigned comment added by 128.178.14.162 (talk) 12:48, 20 June 2014 (UTC)