# Talk:Heaps' law

WikiProject Linguistics / Applied Linguistics (Rated Start-class; importance not yet rated)

WikiProject Statistics (Rated Start-class, Low-importance)

## [Untitled]

The original version of this page was adapted from http://planetmath.org/?method=src&from=objects&id=3431&op=getobj, owned by akrowne, with permission under the GFDL.

## Divergence

I think that with "Where VR is the subset of the vocabulary V represented by the instance text of size n" the author meant to say "Where VR is the cardinality of the subset of the vocabulary V represented by the instance text of size n", because a subset is not a number. However, the size of the subset diverges (i.e. becomes arbitrarily large) as n goes to infinity. That would only make sense if the vocabulary were also of infinite size. (As a side note: I would have expected the fraction of the vocabulary not covered by the text to decrease exponentially as the documents get larger.) Icek (talk) 19:02, 29 September 2009 (UTC)

Vocabulary size is infinite according to generative grammar; see e.g. Mark Aronoff, "Word formation in generative Grammar", MIT Press 1985, and Andras Kornai, "How many words are there?", Glottometrics 2002/4, 61-86. 88.132.28.96 (talk) 20:33, 17 March 2012 (UTC)
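
To make the divergence point above concrete, here is a minimal sketch, assuming the usual power-law form of Heaps' law, VR(n) = K n^β with 0 < β < 1. The function names and the values K = 10 and β = 0.5 are illustrative assumptions, not taken from the article or from any corpus. The sketch shows that the predicted number of distinct words keeps growing with n, so it eventually exceeds any fixed finite vocabulary size.

```python
def heaps_vocabulary_size(n, K=10.0, beta=0.5):
    """Predicted number of distinct words in a text of n tokens: K * n**beta.

    K and beta are hypothetical illustration values, not fitted parameters.
    """
    return K * n ** beta


def tokens_needed_to_reach(vocab_size, K=10.0, beta=0.5):
    """Text length n at which K * n**beta equals vocab_size (inverse of the law)."""
    return (vocab_size / K) ** (1.0 / beta)


if __name__ == "__main__":
    # The prediction grows without bound as n increases.
    for n in (10**3, 10**6, 10**9, 10**12):
        print(n, heaps_vocabulary_size(n))

    # Text length at which the prediction reaches a vocabulary of one million words.
    print(tokens_needed_to_reach(1_000_000))
```

Under these assumed parameters, any finite vocabulary size is reached at some finite text length, which is why the formula only makes sense asymptotically if the vocabulary is treated as unbounded.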