Trigram

From Wikipedia, the free encyclopedia
"Trigram" may also refer to the Ba gua, a philosophical concept from ancient China, or to a three-letter acronym.

Trigrams are a special case of the N-gram in which N is 3: sequences of three consecutive items, such as characters or words. They are often used in natural language processing for statistical analysis of texts.

Frequency

The 16 most common character-level trigrams in English are:[1]

Rank Trigram
1 the
2 and
3 tha
4 ent
5 ing
6 ion
7 tio
8 for
9 nde
10 has
11 nce
12 edt
13 tis
14 oft
15 sth
16 men
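
A ranking like the one above can be approximated by counting overlapping three-letter windows in a sample text. The following is a minimal sketch, not the method used for the cited table. It assumes that case is folded and that everything except letters is dropped before counting, which would be consistent with entries such as "edt", "oft" and "sth" that appear to span word boundaries. The function name top_char_trigrams and its parameter k are hypothetical.

```python
from collections import Counter


def top_char_trigrams(text, k=16):
    """Return the k most frequent letter trigrams in text as (trigram, count) pairs."""
    # Assumption: fold case and keep letters only, so that trigrams
    # may run across word boundaries.
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    # Slide a window of width 3 over the letter stream and tally each trigram.
    counts = Counter(letters[i:i + 3] for i in range(len(letters) - 2))
    return counts.most_common(k)
```

The resulting ranking depends heavily on the sample text, so a short input will not reproduce the table.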

Examples

The sentence "the quick red fox jumps over the lazy brown dog" has the following word-level trigrams:

the quick red
quick red fox
red fox jumps
fox jumps over
jumps over the
over the lazy
the lazy brown
lazy brown dog
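
The word-level trigrams listed above can be produced by sliding a window of three words across the sentence. This is a minimal sketch that assumes words are separated by whitespace and that no punctuation handling is needed. The function name word_trigrams is hypothetical.

```python
def word_trigrams(sentence):
    """Return the word-level trigrams of sentence as a list of 3-tuples."""
    # Assumption: whitespace is the only word separator.
    words = sentence.split()
    # A sentence of n words yields n - 2 overlapping trigrams (none if n < 3).
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


trigrams = word_trigrams("the quick red fox jumps over the lazy brown dog")
for t in trigrams:
    print(" ".join(t))
```

The example sentence has ten words, so the loop prints eight trigrams.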

The word-level trigram "the quick red", in turn, has the following character-level trigrams (where an underscore "_" marks a space):

the he_ e_q
_qu qui uic
ick ck_ k_r
_re red
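
The character-level case works the same way, with the window sliding over characters instead of words. In this minimal sketch, spaces are first replaced with an underscore to match the notation used above. The function name char_trigrams and the space_marker parameter are hypothetical.

```python
def char_trigrams(text, space_marker="_"):
    """Return the character-level trigrams of text, with spaces shown as space_marker."""
    # Make spaces visible so that trigrams crossing a word boundary stay readable.
    marked = text.replace(" ", space_marker)
    # A string of n characters yields n - 2 overlapping trigrams (none if n < 3).
    return [marked[i:i + 3] for i in range(len(marked) - 2)]


result = char_trigrams("the quick red")
print(" ".join(result))
```

The phrase "the quick red" has 13 characters including the two spaces, so it yields 11 trigrams.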
  

References

  1. Lewand, Robert (2000). Cryptological Mathematics. The Mathematical Association of America. p. 36. ISBN 978-0-88385-719-9.