Lexical density

In computational linguistics, lexical density is an estimate of how much content a text carries: the proportion of lexical (content) words among all of its words, functional (grammatical) and lexical alike. It is used in discourse analysis as a descriptive parameter that varies with register and genre. For example, spoken texts tend to have a lower lexical density than written ones.

Lexical density may be determined thus:

 L_d = \frac{N_{\mathrm{lex}}}{N} \times 100

Where:

L_d = the analysed text's lexical density

N_{\mathrm{lex}} = the number of lexical word tokens (nouns, adjectives, verbs, adverbs) in the analysed text

N = the number of all tokens (total number of words) in the analysed text

(The variable symbols used here are not conventional; they were chosen arbitrarily for this illustration.)
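
The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a standard implementation: it assumes the text has already been tokenised and tagged with parts of speech, and the tag names (`NOUN`, `ADJ`, `VERB`, `ADV`), the function name `lexical_density` and the hand-tagged sample sentence are all assumptions made for the example.

```python
# Coarse part-of-speech tags counted as lexical (content) words.
# The tag names are an assumption; adapt them to the tagger actually used.
LEXICAL_TAGS = {"NOUN", "ADJ", "VERB", "ADV"}


def lexical_density(tagged_tokens):
    """Return L_d = (N_lex / N) * 100 for a list of (word, tag) pairs."""
    n = len(tagged_tokens)  # N: all word tokens in the text
    if n == 0:
        raise ValueError("cannot compute lexical density of an empty text")
    # N_lex: tokens whose tag marks them as lexical words
    n_lex = sum(1 for _, tag in tagged_tokens if tag in LEXICAL_TAGS)
    return n_lex / n * 100


# Hand-tagged sample sentence (hypothetical data, for illustration only).
sentence = [
    ("The", "DET"),
    ("quick", "ADJ"),
    ("fox", "NOUN"),
    ("jumped", "VERB"),
    ("over", "ADP"),
    ("the", "DET"),
    ("lazy", "ADJ"),
    ("dog", "NOUN"),
]

print(lexical_density(sentence))  # 5 lexical tokens out of 8 gives 62.5
```

In practice the result depends on how the text is tokenised and on which word classes are treated as lexical, so figures from different tools are not necessarily comparable.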

Further reading

  • Ure, J (1971). Lexical density and register differentiation. In G. Perren and J.L.M. Trim (eds), Applications of Linguistics, London: Cambridge University Press. 443-452.