NER model


The NER model is one of several methods for measuring the accuracy of live subtitles that are produced using speech recognition for television broadcasts and events. The three letters stand for number, edition error and recognition error. It is an alternative to the WER (word error rate) model used in several countries.

The model provides a formula for the quality of live subtitles, in which an NER value of 100 indicates that the content was subtitled entirely correctly. The score is calculated as follows: first, the number of edition and recognition errors is subtracted from the total number of words in the live subtitles; the result is then divided by the total number of words in the live subtitles and multiplied by one hundred.

The letters stand for the following:

  • N (number) = total number of words in the live subtitles
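  • E (edition errors) = number of edition errors, that is, errors introduced when the subtitler edits the spoken content
  • R (recognition errors) = number of recognition errors, that is, words misrecognized during speech recognition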
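
Written out, the calculation described above is NER = (N - E - R) / N × 100. The following is a minimal sketch of that arithmetic in Python. The function name ner_score, its parameter names and the sample figures are hypothetical and chosen only for illustration; the sketch counts errors exactly as described above and is not an official implementation of the model.

```python
def ner_score(n_words: int, edition_errors: float, recognition_errors: float) -> float:
    """Return the NER score as a percentage (100 means no errors).

    n_words            -- N, total number of words in the live subtitles
    edition_errors     -- E, number of edition errors
    recognition_errors -- R, number of recognition errors
    """
    if n_words <= 0:
        raise ValueError("n_words must be a positive number of words")
    # NER = (N - E - R) / N * 100
    return 100 * (n_words - edition_errors - recognition_errors) / n_words


# Hypothetical sample: 500 subtitle words, 4 edition errors, 6 recognition errors.
sample_score = ner_score(500, 4, 6)
print(f"NER score: {sample_score:.1f}")
```

With these assumed figures, 10 errors are subtracted from 500 words, giving 490 / 500 × 100 = 98.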

This measurement method is used for public television broadcasts in several European countries, such as Italy and Switzerland. Other countries and authorities, such as the British regulator Ofcom, have expressed an interest.

By contrast, the WER model is static, since it simply measures the textual discrepancy between what was spoken and what was written.
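
To illustrate what such a purely textual comparison looks like, the sketch below computes a word error rate in the way it is commonly defined: the word-level edit distance (substitutions, deletions and insertions) between the spoken text and the subtitle text, divided by the number of spoken words. This definition is an assumption drawn from general usage rather than from the text above, and the function name word_error_rate and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate as a percentage.

    reference  -- the words that were spoken
    hypothesis -- the words that appeared in the subtitles
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, computed one row at a time.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(min(
                previous_row[j] + 1,                      # word missing from subtitles
                current_row[j - 1] + 1,                   # extra word in subtitles
                previous_row[j - 1] + substitution_cost,  # match or substitution
            ))
        previous_row = current_row

    return 100 * previous_row[-1] / len(ref_words)


# Hypothetical sample: half of the spoken words are missing from the subtitles.
print(word_error_rate("the cat sat on the mat", "the cat sat"))
```

In this assumed example three of the six spoken words are missing, so the rate is 50 percent, regardless of whether the omitted words mattered to the meaning.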

See also


  • Pablo Romero-Fresco: Subtitling through Speech Recognition: Respeaking. Manchester: St. Jerome 2011, ISBN 9781905763283

External links

  • [1], presentation of the NER model at the International Telecommunication Union - Geneva, November 25, 2011
  • [2], British Ofcom - The quality of live subtitling, May 2013
  • [3], NER – The Concept, Types, and Applications