Talk:Document retrieval

From Wikipedia, the free encyclopedia
WikiProject Computing
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's quality scale.
This article has not yet received a rating on the project's importance scale.
 

Merging with other articles


Merge Work

TODO: shorten redirects (what links to text retrieval).

Here is the content from Document retrieval that I will do my best to integrate.


Text retrieval is a branch of computerised information retrieval in which the information is stored primarily in the form of text. In early systems, the user could retrieve any documents to which given keywords had been attached; both indexing and searching were relatively skilled occupations.

The advent of full text searching made the job of the indexer redundant during the 1980s. Text databases moved from being large and centralised to local and personal, thanks to the personal computer and the CD-ROM.
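The difference between retrieval by attached keywords and full text searching can be illustrated with a small sketch. This is only an illustration under stated assumptions: the document collection, the keywords, and the function names (`keyword_search`, `build_full_text_index`, `full_text_search`) are all hypothetical and do not describe any particular historical system.

```python
import re
from collections import defaultdict

# Hypothetical toy collection: the ids, texts, and keywords are invented.
DOCUMENTS = {
    1: {
        "text": "Inverted files speed up searching of large text databases.",
        "keywords": {"indexing", "databases"},
    },
    2: {
        "text": "CD-ROM publishing brought text databases to the personal computer.",
        "keywords": {"cd-rom", "publishing"},
    },
}


def keyword_search(term, documents=DOCUMENTS):
    """Return ids of documents to which the keyword was attached by an indexer."""
    term = term.lower()
    return sorted(
        doc_id for doc_id, doc in documents.items() if term in doc["keywords"]
    )


def build_full_text_index(documents=DOCUMENTS):
    """Map every word that occurs in a document's text to the ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for word in re.findall(r"[a-z0-9-]+", doc["text"].lower()):
            index[word].add(doc_id)
    return index


def full_text_search(index, term):
    """Return ids of documents whose text contains the word."""
    return sorted(index.get(term.lower(), set()))
```

In this sketch, a keyword search for "databases" finds only the document an indexer tagged with that keyword, while a full text search finds every document whose text contains the word. Conversely, "indexing" is found only by the keyword search, because the word never occurs in the text itself.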

Text retrieval is a critical area of study today, since it is the fundamental basis of all internet search engines.

Example: PubMed

The PubMed form interface features a "related articles" search, which compares words from the documents' titles, abstracts, and MeSH terms using a word-weighted algorithm. The details of this algorithm are given here [1].
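As a rough illustration of what a word-weighted comparison over title, abstract, and MeSH terms might look like, here is a minimal sketch. It is not PubMed's actual algorithm: the field weights, the cosine similarity measure, and the names `FIELD_WEIGHTS`, `weighted_terms`, `similarity`, and `related_articles` are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Assumed weights per field; the real system's weighting is not given here.
FIELD_WEIGHTS = {"title": 2.0, "abstract": 1.0, "mesh": 3.0}


def weighted_terms(doc):
    """Sum a weight for each word, depending on the field it appears in."""
    weights = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        for word in re.findall(r"[a-z0-9]+", doc.get(field, "").lower()):
            weights[word] += weight
    return weights


def similarity(doc_a, doc_b):
    """Cosine similarity between the weighted word vectors of two documents."""
    a, b = weighted_terms(doc_a), weighted_terms(doc_b)
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_articles(query_doc, candidates):
    """Order candidate documents from most to least similar to query_doc."""
    return sorted(
        candidates, key=lambda doc: similarity(query_doc, doc), reverse=True
    )
```

Each document is assumed to be a dict with optional "title", "abstract", and "mesh" string fields. Documents that share heavily weighted words (here, MeSH terms and title words) rank higher than documents that share only a few abstract words.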

See also

External links

Relationship to human indexing

The opening paragraph included "The advent of full text searching made the job of the indexer redundant during the 1980s". This is simply wrong; a full explanation of why is given at http://www.jalamb.com/Full_text_searches.html (Preceding unsigned comment added by Proindexer (talk • contribs) 10:57, 16 May 2009 (UTC))

Merged article

This included

  • adding category "Substring indices" from the original article
  • adding sections "Form based", "Content based", "Further reading" here, to accommodate
  • minimal alterations to the original article text

CpiralCpiral 20:27, 4 September 2013 (UTC)