ROUGE (metric)

This is an old revision of this page, as edited by Nikhilweee (talk | contribs) at 18:22, 4 July 2017 (Removed dead link to berouge.com). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

ROUGE, or Recall-Oriented Understudy for Gisting Evaluation,[1] is a set of metrics and a software package used for evaluating automatic summarization and machine translation software in natural language processing. The metrics compare an automatically produced summary or translation against one or more human-produced reference summaries or translations.

Metrics

The following five evaluation metrics[2] are available.

  • ROUGE-N: N-gram[3] based co-occurrence statistics.
  • ROUGE-L: Longest Common Subsequence (LCS)[4] based statistics. The longest common subsequence naturally takes sentence-level structure similarity into account and automatically identifies the longest in-sequence co-occurring n-grams.
  • ROUGE-W: Weighted LCS-based statistics that favor consecutive LCSes.
  • ROUGE-S: Skip-bigram[5] based co-occurrence statistics. A skip-bigram is any pair of words in their sentence order, allowing arbitrary gaps between them.
  • ROUGE-SU: Skip-bigram plus unigram-based co-occurrence statistics.
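
The definitions above can be illustrated with a minimal Python sketch. It is not the ROUGE software package: it assumes pre-tokenized input, a single reference, and recall-only scores, and it omits stemming, stop-word handling, ROUGE-W weighting, and the F-measure. All function names (ngrams, rouge_n_recall, lcs_length, rouge_l_recall, skip_bigrams, rouge_s_recall) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    return sum((cand & ref).values()) / total


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length (sentence-level sketch)."""
    if not reference:
        return 0.0
    return lcs_length(candidate, reference) / len(reference)


def skip_bigrams(tokens):
    """All word pairs in sentence order, with any gap, as a multiset."""
    return Counter(combinations(tokens, 2))


def rouge_s_recall(candidate, reference):
    """Clipped skip-bigram overlap divided by the reference skip-bigram count."""
    cand, ref = skip_bigrams(candidate), skip_bigrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    return sum((cand & ref).values()) / total


# Illustrative sentences, tokenized by whitespace (an assumption of this sketch).
reference = "police killed the gunman".split()
candidate = "police kill the gunman".split()

print(rouge_n_recall(candidate, reference, n=1))  # unigram recall
print(rouge_n_recall(candidate, reference, n=2))  # bigram recall
print(rouge_l_recall(candidate, reference))       # LCS-based recall
print(rouge_s_recall(candidate, reference))       # skip-bigram recall
```

In this example the candidate shares three of the four reference unigrams, one of the three reference bigrams, an LCS of length three ("police the gunman"), and three of the six reference skip-bigrams. Counts are clipped by multiset intersection, so a word repeated in the candidate cannot be credited more often than it occurs in the reference.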

ROUGE can be downloaded from https://github.com/RxNLP/ROUGE-2.0/tree/master/distribute/downloads.

References