Trigram

From Wikipedia, the free encyclopedia
A trigram may also refer to Ba gua, a philosophical concept in ancient China. It may also refer to a three-letter acronym.

Trigrams are a special case of the n-gram in which n is 3: sequences of three consecutive items, such as characters or words. They are often used in natural language processing for the statistical analysis of texts.

Frequency

The 16 most common character-level trigrams in English are:[1]

Rank Trigram
1 the
2 and
3 tha
4 ent
5 ing
6 ion
7 tio
8 for
9 nde
10 has
11 nce
12 edt
13 tis
14 oft
15 sth
16 men
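
A ranking like the one above can be produced by sliding a three-character window over a text and tallying each trigram. The following is a minimal sketch, not the method used by the cited source. Entries such as "edt" and "sth" suggest that spaces and punctuation were removed before counting, so the sketch assumes that preprocessing; the function name char_trigram_counts is hypothetical.

```python
from collections import Counter

def char_trigram_counts(text):
    """Count character-level trigrams in text.

    Assumption: only letters are kept and they are lowercased, so
    trigrams may span word boundaries (for example "edt").
    """
    letters = "".join(ch.lower() for ch in text if ch.isalpha())
    # Slide a window of width 3 across the letter stream.
    return Counter(letters[i:i + 3] for i in range(len(letters) - 2))

counts = char_trigram_counts("the cat and the hat")
print(counts.most_common(1))  # [('the', 2)]
```

Running this over a large English corpus and taking most_common(16) would yield a table of the same shape, although the exact ranking depends on the corpus.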

Examples

The sentence "the quick red fox jumps over the lazy brown dog" has the following word-level trigrams:

the quick red
quick red fox
red fox jumps
fox jumps over
jumps over the
over the lazy
the lazy brown
lazy brown dog
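
The list above can be reproduced with a short sketch that splits the sentence on whitespace and takes every run of three consecutive words. The function name word_trigrams is hypothetical, and whitespace splitting is an assumption; real text would usually need a proper tokenizer.

```python
def word_trigrams(sentence):
    """Return every run of three consecutive words as a tuple."""
    words = sentence.split()  # assumption: words are whitespace-separated
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

sentence = "the quick red fox jumps over the lazy brown dog"
trigrams = word_trigrams(sentence)
for trigram in trigrams:
    print(" ".join(trigram))
```

A sentence of k words yields k - 2 trigrams, so the ten-word sentence here gives eight.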

The word-level trigram "the quick red" in turn has the following character-level trigrams (read down each column; an underscore "_" marks a space):

the  qui  k_r
he_  uic  _re
e_q  ick  red
_qu  ck_
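
The same sliding-window idea applies at the character level. This sketch replaces each space with an underscore, as in the listing above, and then takes every run of three consecutive characters. The name char_trigrams and the space_marker parameter are hypothetical.

```python
def char_trigrams(text, space_marker="_"):
    """Return every run of three consecutive characters, spaces marked."""
    marked = text.replace(" ", space_marker)
    return [marked[i:i + 3] for i in range(len(marked) - 2)]

result = char_trigrams("the quick red")
print(result)
# ['the', 'he_', 'e_q', '_qu', 'qui', 'uic', 'ick', 'ck_', 'k_r', '_re', 'red']
```

The phrase has 13 characters including spaces, which gives 13 - 2 = 11 trigrams.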

References

