NIST (metric)

From Wikipedia, the free encyclopedia

NIST is a method for evaluating the quality of text produced by machine translation. Its name comes from the US National Institute of Standards and Technology.

It is based on the BLEU metric, but with some alterations. Where BLEU simply calculates n-gram precision, giving equal weight to each n-gram, NIST also calculates how informative a particular n-gram is. That is, when a correct n-gram is found, the rarer that n-gram is, the more weight it is given.[1]

For example, if the bigram "on the" is correctly matched, it receives a lower weight than a correct match of the bigram "interesting calculations", because the latter is less likely to occur.
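
The weighting can be illustrated with a minimal Python sketch. It assumes the information weight defined in the cited paper[1], Info(w1...wn) = log2(count(w1...wn-1) / count(w1...wn)), with counts taken over the reference text and the total word count used as the numerator for unigrams. The function names and the toy reference text are hypothetical and serve only as illustration; this is not the official scoring script.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def info_weight(ngram, counts, total_words):
    """Information weight of an n-gram that occurs in the reference text.

    Assumed formula: log2(count of the (n-1)-word prefix / count of the n-gram).
    For a unigram the prefix count is the total number of reference words.
    """
    prefix_count = counts[ngram[:-1]] if len(ngram) > 1 else total_words
    return math.log2(prefix_count / counts[ngram])


# Toy reference text (hypothetical): "on" is always followed by "the",
# while "interesting" is followed by four different words.
reference = (
    "the cat sat on the mat . interesting calculations are on the board . "
    "interesting ideas and interesting results are on the table . "
    "an interesting book"
).split()

counts = ngram_counts(reference, max_n=2)
total_words = len(reference)

common = info_weight(("on", "the"), counts, total_words)
rare = info_weight(("interesting", "calculations"), counts, total_words)
print(common, rare)
```

In this toy text "the" is fully predictable after "on", so the bigram "on the" carries no information, whereas "calculations" follows only one of the four occurrences of "interesting" and is weighted more heavily.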

NIST also differs from BLEU in how it calculates the brevity penalty: small variations in translation length affect the overall score less than they do in BLEU.
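
The difference between the two penalties can be sketched as follows. The sketch assumes the standard BLEU penalty exp(1 - r/c) for a candidate of length c shorter than the reference length r, and the NIST penalty from the cited paper[1], exp(beta * log^2(min(L_sys / L_ref, 1))), with beta chosen so that the penalty equals 0.5 when the system output is two thirds of the average reference length. That calibration and the function names are assumptions taken from the paper as understood here, not from this article.

```python
import math

# Assumed calibration: penalty is 0.5 when sys_len / avg_ref_len == 2/3.
BETA = math.log(0.5) / math.log(1.5) ** 2


def bleu_brevity_penalty(sys_len, ref_len):
    """Standard BLEU penalty: 1 if the output is long enough, else exp(1 - r/c).

    sys_len must be positive.
    """
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1 - ref_len / sys_len)


def nist_brevity_penalty(sys_len, avg_ref_len):
    """Assumed NIST penalty: exp(BETA * ln(min(sys_len / avg_ref_len, 1)) ** 2).

    sys_len must be positive.
    """
    ratio = min(sys_len / avg_ref_len, 1.0)
    return math.exp(BETA * math.log(ratio) ** 2)


# Compare both penalties for outputs slightly and substantially too short.
for sys_len in (100, 95, 90, 67):
    print(sys_len,
          round(bleu_brevity_penalty(sys_len, 100), 4),
          round(nist_brevity_penalty(sys_len, 100), 4))
```

Under these assumptions, an output that is 5% shorter than the reference is penalised noticeably less by the NIST formula than by the BLEU formula.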

References

  1. "Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics" (PDF). Retrieved 2010-04-17.

NIST 2005 Machine Translation Evaluation Official Results