Fuzzy matching (computer-assisted translation)

Fuzzy matching is a technique used in computer-assisted translation as a special case of record linkage. It works with matches that are less than 100% exact when finding correspondences between segments of a text and entries in a database of previous translations. It usually operates on sentence-level segments, but some translation technology allows matching at the phrase level. It is used when the translator is working with translation memory (TM).

Background

When an exact match cannot be found in the TM database for the text being translated, the tool can search for a match that is less than exact. The translator sets the fuzzy match threshold to a percentage below 100%, and the database then returns any stored matches that meet that threshold. The primary function of fuzzy matching is to assist the translator by speeding up the translation process; it is not designed to replace the human translator.
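The threshold lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how TM tools compute the match percentage, so the sketch substitutes the character-level similarity ratio from Python's standard `difflib` module, and the function name `fuzzy_lookup`, the 75% default threshold and the sample memory entries are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_lookup(segment, memory, threshold=0.75):
    """Return (score, source, target) tuples scoring at or above threshold, best first.

    `memory` is a list of (source, target) pairs. The score is difflib's
    similarity ratio (0.0 to 1.0), an assumption standing in for whatever
    measure a real TM tool uses.
    """
    hits = []
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            hits.append((score, source, target))
    # Highest-scoring proposal first.
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Hypothetical translation memory with two English-French entries.
memory = [
    ("Press the red button to stop the machine.",
     "Appuyez sur le bouton rouge pour arrêter la machine."),
    ("Store the device in a dry place.",
     "Conservez l'appareil dans un endroit sec."),
]

query = "Press the green button to stop the machine."
for score, source, target in fuzzy_lookup(query, memory):
    print(f"{score:.0%}  {source}  ->  {target}")
```

Raising the threshold returns fewer but closer proposals; lowering it returns more proposals that need heavier editing.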

History

Because of the polymorphous and dynamic nature of language, particularly English (which accounts for 90% of all source texts undergoing translation in the localisation industry[citation needed]), methods are continually being sought to make the translation process easier and faster. Since the late 1980s, translation memory tools have been developed to increase translators' productivity and speed up the translation process as a whole.

In the 1990s, fuzzy matching became a prominent feature of TM tools. Despite some concerns about the extra work involved in editing a fuzzy match "proposal", it remains a feature of most popular TM tools.

Methodology

The TM tool searches the database to locate segments that approximately match a segment in the new source text to be translated. The TM, in effect, "proposes" the match to the translator, who can either accept the proposal or edit it so that it corresponds more closely to the new source text. In this way, fuzzy matching can speed up the translation process and increase productivity.
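To make the "edit the proposal" step concrete, the following Python sketch lists the word-level differences between a stored source segment and the new segment, which is the information a translator needs to adapt the proposed translation. It is an illustration only: the article does not describe how TM tools present differences, the function name `differences` is hypothetical, and the comparison relies on the standard `difflib` module rather than any particular tool's method.

```python
from difflib import SequenceMatcher


def differences(new_segment, stored_segment):
    """List the word-level changes that turn the stored segment into the new one.

    Each change is a (tag, old_words, new_words) tuple, where tag is one of
    difflib's opcode names: "replace", "delete" or "insert".
    """
    old_words = stored_segment.split()
    new_words = new_segment.split()
    matcher = SequenceMatcher(None, old_words, new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the parts the translator must revise
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


stored = "Press the red button to stop the machine."
new = "Press the green button to stop the machine."
print(differences(new, stored))
```

An empty list means the stored segment is an exact match, so the proposal could be accepted as it stands; any other result points the translator to the words in the proposed translation that need checking.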

This raises questions about the quality of the resulting translations. A translator under pressure to deliver on time may accept a fuzzy match proposal without checking its suitability and context. TM databases are built up from input by many different translators working on a variety of texts, so there is a danger that sentences drawn from this "tapestry" of words will form a stitched-together hodgepodge of styles, the opposite of the consistency translators strive for; some critics have dubbed this "sentence salad". How much faith to place in the TM's proposals can therefore be a problem when balancing a faster translation process against the quality of the translation. Nevertheless, fuzzy matching remains an important part of the translator's toolkit.
