In corpus linguistics, a key word is a word that occurs in a text more often than would be expected by chance alone. Key words are identified by carrying out a statistical test (e.g., log-likelihood or chi-squared) that compares each word's frequency in the text with its expected frequency derived from a much larger corpus, which acts as a reference for general language use. Keyness is then the quality a word or phrase has of being "key" in its context.
Compare this with collocation, the quality linking two words or phrases, usually assumed to be within a given span of each other. Keyness is a textual feature, not a language feature: a word has keyness in a certain textual context but may well not have keyness in other contexts. By contrast, a node and its collocate are often found together in texts of the same genre, so collocation is to a considerable extent a language phenomenon. The keywords found in a given text share keyness; they are co-key. Words typically found in the same texts as a key word are called associates.
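The keyness calculation described above can be illustrated with a minimal sketch in Python. It assumes the two-term log-likelihood (G2) statistic as the test and pre-tokenized input. The function names `log_likelihood` and `keywords`, and the default threshold of 3.84 (the chi-squared critical value for p < 0.05 with one degree of freedom), are choices made for this illustration and are not taken from any particular tool.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-term log-likelihood (G2) for one word.

    a: frequency of the word in the study text
    b: frequency of the word in the reference corpus
    c: total tokens in the study text
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally common in both.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    # A zero observed count contributes nothing to the sum.
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def keywords(text_tokens, ref_tokens, threshold=3.84):
    """Return (word, score) pairs for words that are unusually
    frequent in text_tokens relative to ref_tokens, highest first.

    Both token lists are assumed to be non-empty.
    """
    text_freq = Counter(text_tokens)
    ref_freq = Counter(ref_tokens)
    n_text, n_ref = len(text_tokens), len(ref_tokens)
    results = []
    for word, a in text_freq.items():
        b = ref_freq.get(word, 0)
        # Keep only words that are relatively MORE frequent in the text.
        if a / n_text <= b / n_ref:
            continue
        score = log_likelihood(a, b, n_text, n_ref)
        if score >= threshold:
            results.append((word, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

For example, calling `keywords(["whale"] * 20 + ["the"] * 80, ["the"] * 900 + ["sea"] * 100)` flags only "whale": "the" is relatively less frequent in the study text than in the reference corpus, so it is skipped. In practice the reference corpus would be far larger, and a stricter threshold is often used because many words are tested at once.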
- Scott, M. & Tribble, C., 2006, Textual Patterns: keyword and corpus analysis in language education, Amsterdam: Benjamins, 55.
- Scott, M. & Tribble, C., 2006, Textual Patterns: keyword and corpus analysis in language education, Amsterdam: Benjamins, especially chapters 4 & 5.
- Understanding the role of text length, sample size and vocabulary size in determining text coverage, by Kiyomi Chujo and Masao Utiyama
- Frequency Level Checker