Talk:Question answering

From Wikipedia, the free encyclopedia
WikiProject Computing (Rated Start-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale.
This article has not yet received a rating on the project's importance scale.

Removed unreferenced statements from "Architecture" section

I removed the following statements from the article, for lacking citations, per WP:VER. Please do not put them back until they are properly referenced. Thank you. The Transhumanist 09:48, 23 March 2017 (UTC)

Most modern QA systems use natural language text documents as their underlying knowledge source.

Natural language processing techniques are used to both process the question and index or process the text corpus from which answers are extracted.

An increasing number of QA systems use the World Wide Web as their corpus of text and knowledge; however, many of these tools do not produce a human-like answer, but rather employ "shallow" methods (keyword-based techniques, templates, etc.) to produce a list of documents or a list of document excerpts containing the probable answer highlighted.
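For readers of this thread, the "shallow" keyword-based approach mentioned in the statement above can be illustrated with a minimal sketch. It is not taken from any cited QA system: the stopword list, the function names `keywords` and `rank_excerpts`, and the sample excerpts are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "is", "was", "who", "what", "when", "where",
             "of", "in", "by", "did", "does"}

def keywords(text):
    """Return the set of lowercase word tokens in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}

def rank_excerpts(question, excerpts):
    """Order excerpts by the number of keywords shared with the question.

    Excerpts sharing no keyword are dropped; ties keep their input order.
    """
    q = keywords(question)
    scored = [(len(q & keywords(e)), e) for e in excerpts]
    scored.sort(key=lambda pair: -pair[0])  # stable sort, best score first
    return [e for score, e in scored if score > 0]

# Toy excerpts, invented for this example only.
excerpts = [
    "Penicillin is an antibiotic.",
    "The telephone was invented long ago.",
    "Nobody is sure who invented penicillin first.",
    "Paris is in France.",
]
ranked = rank_excerpts("Who invented penicillin?", excerpts)
print(ranked[0])
```

Note that the result is a ranked list of excerpts, not a human-like answer, which is the limitation the removed statement describes.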

In an alternative QA implementation, human users assemble knowledge in a structured database, called a knowledge base, similar to those employed in the expert systems of the 1970s.

It is also possible to employ a combination of structured databases and natural language text documents in a hybrid QA system.

Such a hybrid system may employ data mining algorithms to populate a structured knowledge base that is also populated and edited by human contributors.

An example of a hybrid QA system is Wolfram Alpha, which employs natural language processing to transform human questions into a form that is processed by a curated knowledge base.

After the question is analysed, the system typically uses several modules that apply increasingly complex NLP techniques to a gradually reduced amount of text. First, a document retrieval module uses search engines to identify the documents or paragraphs in the document set that are likely to contain the answer. Then a filter preselects small text fragments that contain strings of the same type as the expected answer.

For example, if the question is "Who invented penicillin?", the filter returns text that contains names of people. Finally, an answer extraction module looks for further clues in the text to determine if the answer candidate can indeed answer the question.
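The three stages described in the last two statements (document retrieval, answer-type filtering, answer extraction) can be sketched roughly as follows. This is only an illustration under strong assumptions: the function names are hypothetical, retrieval is plain word overlap rather than a search engine, and a "person name" is naively taken to be two consecutive capitalized words. A real system would use a search index and a named-entity recognizer.

```python
import re

# Naive stand-in for a person-name recognizer (assumption): two capitalized words.
PERSON = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def tokens(text):
    """Set of lowercase alphabetic tokens in text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_n=2):
    """Stage 1: keep the top_n documents sharing the most words with the question."""
    q = tokens(question)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:top_n]

def filter_fragments(documents):
    """Stage 2: keep sentences containing a string of the expected answer type."""
    fragments = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            if PERSON.search(sentence):
                fragments.append(sentence)
    return fragments

def extract_answer(question, fragments):
    """Stage 3: pick the fragment closest to the question and return its name."""
    q = tokens(question)
    best = max(fragments, key=lambda s: len(q & tokens(s)), default=None)
    return PERSON.search(best).group(0) if best else None

def answer(question, documents):
    """Run the three stages on a progressively smaller amount of text."""
    return extract_answer(question, filter_fragments(retrieve(question, documents)))

# Toy document set, written for this example only.
documents = [
    "Penicillin is an antibiotic. Alexander Fleming discovered penicillin in 1928.",
    "The telephone is credited to Graham Bell.",
    "Bread mould grows in damp places.",
]
result = answer("Who invented penicillin?", documents)
print(result)  # -> Alexander Fleming
```

Each stage works on less text than the one before it: whole documents, then sentences, then a single candidate string. The overlap score in the last stage stands in for the "further clues" an answer extraction module would look for.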