The lexical diversity, or type-token ratio (TTR), of a given text is defined as the ratio of unique word stems (types) to the total number of words (tokens). In applied linguistics and related fields, the term is used as an equivalent of lexical richness.
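As an illustration of the definition, the ratio can be computed in a few lines of Python. This is a minimal sketch, not a reference implementation: the helper names `tokenize` and `type_token_ratio` are hypothetical, and lowercased word forms are used as a stand-in for word stems, which a real analysis would obtain from a stemmer or lemmatizer.

```python
import re


def tokenize(text):
    # Assumption: lowercased word forms approximate word stems.
    # A stemmer or lemmatizer would be needed to match the definition exactly.
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Return the number of types divided by the number of tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "the cat sat on the mat and the dog sat on the rug"
# 8 distinct word forms out of 13 running words
print(round(type_token_ratio(sample), 3))
```

A value of 1.0 means that no word is repeated; values closer to 0 indicate heavy repetition.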
A problem with the lexical diversity measure is that text samples containing a large number of tokens tend to give lower TTR values, because a writer or speaker must re-use function words more and more often as the text grows. One consequence is that lexical diversity is best used for comparing texts of equal length.
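The length effect can be seen by computing TTR over progressively longer prefixes of one text. The sketch below is illustrative only: the function names `ttr_curve` and `compare_at_equal_length` are hypothetical, tokens are split on whitespace, and truncating both texts to the length of the shorter one is just one simple way of comparing at equal length.

```python
def ttr(tokens):
    """Type-token ratio of a list of tokens (0.0 for an empty list)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def ttr_curve(tokens, step):
    """TTR of the first n tokens, for n = step, 2 * step, ... up to len(tokens)."""
    return [(n, ttr(tokens[:n])) for n in range(step, len(tokens) + 1, step)]


def compare_at_equal_length(tokens_a, tokens_b):
    """Truncate both texts to the shorter length before computing TTR."""
    n = min(len(tokens_a), len(tokens_b))
    return ttr(tokens_a[:n]), ttr(tokens_b[:n])


text = ("the quick brown fox jumps over the lazy dog and then the fox "
        "runs over the hill while the dog sleeps by the door of the house").split()

curve = ttr_curve(text, step=9)
for n, value in curve:
    print(n, round(value, 3))
```

For this sample the ratio falls as the prefix grows, largely because "the" keeps recurring. TTR is not guaranteed to fall at every step for every text, but the downward trend is typical.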
In a 2013 article, Scott Jarvis proposed that lexical diversity, like diversity in ecology, is a perceptual phenomenon. Lexical redundancy is a positive counterpart of lexical diversity in the same way as lexical variability is the mirror image of repetition. According to Jarvis's model, lexical diversity comprises six properties: variability, volume, evenness, rarity, dispersion, and disparity.
According to Jarvis, the six properties of lexical diversity should be measured by the following indices.
|Property||Index|
|Variability||Measure of Textual Lexical Diversity (MTLD)|
|Volume||Total number of words in the text|
|Evenness||Standard deviation of tokens per type|
|Rarity||Mean BNC rank|
|Dispersion||Mean distance between tokens of the same type|
|Disparity||Mean number of words per sense or Latent Semantic Analysis|
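Three of these indices can be computed from a token list alone. The sketch below covers volume, evenness and dispersion under stated assumptions: the function names are hypothetical, evenness uses the population standard deviation of per-type counts, and dispersion is read as the mean gap, in token positions, between successive occurrences of the same type. Jarvis's exact operationalizations may differ. MTLD, rarity and disparity are left out because they need, respectively, a dedicated algorithm, a BNC frequency list, and a sense inventory or semantic model.

```python
from collections import Counter
from statistics import mean, pstdev


def volume(tokens):
    """Total number of words in the text."""
    return len(tokens)


def evenness(tokens):
    """Standard deviation of tokens per type (population form, an assumption)."""
    counts = list(Counter(tokens).values())
    return pstdev(counts) if counts else 0.0


def dispersion(tokens):
    """Mean gap between successive tokens of the same type; None if no type repeats."""
    last_seen = {}
    gaps = []
    for position, token in enumerate(tokens):
        if token in last_seen:
            gaps.append(position - last_seen[token])
        last_seen[token] = position
    return mean(gaps) if gaps else None


tokens = "a b a c b a".split()
print(volume(tokens), evenness(tokens), dispersion(tokens))
```

A lower evenness value means that tokens are spread more uniformly across types, and a larger dispersion value means that repetitions of a word lie further apart.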