Information retrieval query language


An information retrieval (IR) query language is a query language used to query a search index. A query language is formally defined by a context-free grammar (CFG) and can be used in textual, visual/UI, or speech form. Advanced query languages are often defined for professional users of vertical search engines, giving them more control over how queries are formulated.
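
To illustrate what defining a query language by a CFG can look like, the sketch below writes a small grammar as a comment and implements a recursive-descent parser for it in Python. The grammar, the operator keywords, and every function name (`tokenize`, `parse`, and the `_parse_*` helpers) are assumptions made for this example and are not taken from any particular search engine.

```python
import re

# Hypothetical toy grammar, invented for illustration:
#   expr     -> and_expr ("OR" and_expr)*
#   and_expr -> not_expr ("AND" not_expr)*
#   not_expr -> "NOT" not_expr | atom
#   atom     -> TERM | "(" expr ")"
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")
KEYWORDS = ("AND", "OR", "NOT")


def tokenize(query):
    """Split a query into parentheses and whitespace-delimited words."""
    return TOKEN_RE.findall(query)


def parse(query):
    """Parse a query string into a syntax tree of nested tuples."""
    tokens = tokenize(query)
    tree, pos = _parse_expr(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


def _parse_expr(tokens, pos):
    left, pos = _parse_and(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "OR":
        right, pos = _parse_and(tokens, pos + 1)
        left = ("OR", left, right)
    return left, pos


def _parse_and(tokens, pos):
    left, pos = _parse_not(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "AND":
        right, pos = _parse_not(tokens, pos + 1)
        left = ("AND", left, right)
    return left, pos


def _parse_not(tokens, pos):
    if pos < len(tokens) and tokens[pos] == "NOT":
        operand, pos = _parse_not(tokens, pos + 1)
        return ("NOT", operand), pos
    return _parse_atom(tokens, pos)


def _parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of query")
    token = tokens[pos]
    if token == "(":
        tree, pos = _parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing closing parenthesis")
        return tree, pos + 1
    if token == ")" or token in KEYWORDS:
        raise ValueError(f"unexpected token {token!r}")
    # Terms are lowercased; operator keywords must be written in upper case.
    return ("TERM", token.lower()), pos + 1
```

Under these assumptions, `parse("cat AND (dog OR NOT fish)")` yields a tree whose root is an `AND` node, and a malformed query such as `"cat AND"` raises `ValueError`. A grammar containing only the `TERM` rule would correspond to a language in which only tokens are defined.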

Types of Query Languages

  • Full-text. The simplest query language treats all query terms as a bag of words that is matched against the postings in the inverted index, after which ranking models are applied to retrieve the most relevant documents. Only tokens are defined in the CFG. Web search engines often use this approach.
  • Boolean. A query language that also supports the Boolean operators AND, OR, and NOT.
  • Structured. A language that supports searching within fields, or combinations of fields, when a document is structured and has been indexed using its document structure.
  • Natural language. A query language that supports natural language by parsing the natural-language query into a form best suited to retrieving relevant documents, for example in question answering systems or conversational search.
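
The first three types in the list above can be sketched against a small in-memory inverted index. The Python code below is a minimal illustration under stated assumptions: the sample documents, the field names, and the helpers `build_index`, `postings`, `docs_with`, and `full_text` are all hypothetical, and summed term frequency stands in for a real ranking model.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents, each with two fields.
DOCS = {
    1: {"title": "Boolean retrieval",
        "body": "queries combine terms with boolean operators"},
    2: {"title": "Ranked retrieval",
        "body": "documents are ranked by how well they match the query terms"},
    3: {"title": "Question answering",
        "body": "a system answers natural language questions"},
}


def build_index(docs):
    """Build an inverted index mapping (field, term) -> {doc_id: frequency}."""
    index = defaultdict(Counter)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for term in text.lower().split():
                index[(field, term)][doc_id] += 1
    return index


def postings(index, term, field=None):
    """Return {doc_id: frequency} for a term, in one field or in all fields."""
    merged = Counter()
    for (indexed_field, indexed_term), doc_freqs in index.items():
        if indexed_term == term.lower() and field in (None, indexed_field):
            merged.update(doc_freqs)
    return merged


def docs_with(index, term, field=None):
    """Return the set of document ids that contain the term."""
    return set(postings(index, term, field))


def full_text(index, query):
    """Bag-of-words search, ranked by summed term frequency (a stand-in score)."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(postings(index, term))
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


index = build_index(DOCS)
all_ids = set(DOCS)

# Full-text: every query word is treated as an independent token.
ranked_ids = full_text(index, "boolean retrieval terms")

# Boolean: AND, OR, and NOT map onto set intersection, union, and complement.
retrieval_not_boolean = docs_with(index, "retrieval") & (
    all_ids - docs_with(index, "boolean")
)

# Structured: the same term lookup, restricted to a single field.
ranked_in_title = docs_with(index, "ranked", field="title")
```

In this sketch the Boolean and structured queries return unranked sets of document ids, while the full-text query returns a ranked list. A natural-language query would need an additional parsing step to produce such terms and operators, which is not attempted here.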

Note that an IR query language can mix these types. Special wildcard operators and special search functions for case-sensitive or phrase searches can also be defined as part of a query language.
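
Wildcard, case-sensitive, and phrase searches can be sketched as follows. This Python example is a simplification that scans the documents directly instead of consulting an index; the sample documents and the functions `tokens`, `wildcard_search`, and `phrase_search` are hypothetical. Wildcard matching is delegated to the standard-library function `fnmatch.fnmatchcase`, which supports `*`, `?`, and bracketed character sets.

```python
from fnmatch import fnmatchcase

# Hypothetical sample documents.
DOCS = {
    1: "Information retrieval systems answer queries",
    2: "Query languages for retrieval of information",
    3: "The Retrieval operator expands wildcard queries",
}


def tokens(text, case_sensitive=False):
    """Split text on whitespace, lowercasing unless case matters."""
    words = text.split()
    return words if case_sensitive else [word.lower() for word in words]


def wildcard_search(docs, pattern, case_sensitive=False):
    """Return ids of documents with at least one token matching the pattern."""
    if not case_sensitive:
        pattern = pattern.lower()
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(fnmatchcase(tok, pattern) for tok in tokens(text, case_sensitive))
    }


def phrase_search(docs, phrase, case_sensitive=False):
    """Return ids of documents in which the phrase words occur consecutively."""
    wanted = tokens(phrase, case_sensitive)
    if not wanted:
        return set()
    hits = set()
    for doc_id, text in docs.items():
        toks = tokens(text, case_sensitive)
        last_start = len(toks) - len(wanted)
        if any(toks[i:i + len(wanted)] == wanted for i in range(last_start + 1)):
            hits.add(doc_id)
    return hits


prefix_hits = wildcard_search(DOCS, "lang*")
case_hits = wildcard_search(DOCS, "Retr*", case_sensitive=True)
phrase_hits = phrase_search(DOCS, "information retrieval")
```

A production system would more likely resolve wildcards against the index vocabulary and answer phrase queries from positional postings; the brute-force scan here only shows the intended matching behavior.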

Examples

An example of an IR query language is Contextual Query Language (CQL), a formal language for representing queries to information retrieval systems such as web indexes, bibliographic catalogs, and museum collection information.
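
A CQL search clause generally has the form of an index name, a relation, and a search term, and clauses can be combined with Boolean operators such as `and`. The Python sketch below only assembles such query strings; the helper names `cql_term`, `cql_clause`, and `cql_and` are hypothetical, and the index names `dc.title` and `dc.creator` are assumed to be available on the target server, which depends on the context sets it supports.

```python
def cql_term(value):
    """Quote a search term, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def cql_clause(index, relation, value):
    """Build one search clause: index, relation, then the quoted term."""
    return f"{index} {relation} {cql_term(value)}"


def cql_and(*clauses):
    """Join clauses with the Boolean operator `and`, parenthesizing each one."""
    return " and ".join(f"({clause})" for clause in clauses)


# Assumed index names; a given server may expose different ones.
title_clause = cql_clause("dc.title", "any", "fish frog")
creator_clause = cql_clause("dc.creator", "=", "Smith")
query = cql_and(title_clause, creator_clause)
```

Here `query` holds the string `(dc.title any "fish frog") and (dc.creator = "Smith")`. Quoting every term is a conservative choice made for this sketch, since single-word terms may also be written without quotes.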

