Cross-language information retrieval

Cross-language information retrieval (CLIR) is a subfield of information retrieval that deals with retrieving information written in a language different from the language of the user's query. For example, a user may pose a query in English but retrieve relevant documents written in French. To bridge the two languages, most CLIR systems use translation techniques.[1] CLIR techniques can be classified by the translation resource they rely on:

  • Dictionary-based CLIR techniques
  • Parallel corpora based CLIR techniques
  • Comparable corpora based CLIR techniques
  • Machine translator based CLIR techniques
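As an illustration of the first category, the following is a minimal sketch of dictionary-based query translation followed by simple term matching. It is not taken from any particular CLIR system: the lexicon `EN_FR`, the collection `DOCS`, and the functions `translate_query` and `search` are hypothetical names and toy data invented for this example, and the scoring (a raw count of matching terms) is a deliberately simple stand-in for a real ranking function.

```python
from collections import Counter

# Hypothetical English-to-French toy lexicon (an assumption for this sketch);
# a real system would load a full bilingual dictionary.
EN_FR = {
    "cat": ["chat"],
    "house": ["maison"],
    "bank": ["banque", "rive"],  # ambiguous words keep every translation
}

# Hypothetical French document collection.
DOCS = {
    "d1": "le chat dort dans la maison",
    "d2": "la banque est fermée",
    "d3": "le chien court sur la rive",
}


def translate_query(query, lexicon):
    """Replace each query word with all of its dictionary translations.

    Words missing from the lexicon are passed through unchanged.
    """
    terms = []
    for word in query.lower().split():
        terms.extend(lexicon.get(word, [word]))
    return terms


def search(query, lexicon, docs):
    """Return ids of documents matching the translated query, best first."""
    query_terms = translate_query(query, lexicon)
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        # Score is the total number of occurrences of translated query terms.
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Calling `search("bank", EN_FR, DOCS)` in this sketch matches both the document about a financial institution and the one about a riverbank, which shows the translation ambiguity that dictionary-based approaches have to handle.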

The first workshop on CLIR was held in Zürich during the SIGIR-96 conference.[2] Workshops have been held yearly since 2000 at the meetings of the Cross Language Evaluation Forum (CLEF).

The term "cross-language information retrieval" has many synonyms, of which the following are perhaps the most frequent: cross-lingual information retrieval, translingual information retrieval, and multilingual information retrieval. "Multilingual information retrieval" is used for CLIR in general, but it also has a narrower meaning: cross-language information retrieval over a document collection that is itself multilingual.

See also

  • EXCLAIM (EXtensible Cross-Linguistic Automatic Information Machine)


References

  1. Mittal et al., "Versatile question answering systems: seeing in synthesis", IJIIDS, 5(2), 119-142, 2011.
  2. The proceedings of this workshop can be found in the book Cross-Language Information Retrieval (Grefenstette, ed.; Kluwer, 1998), ISBN 0-7923-8122-X.

External links