Question answering

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language.[1]


A question answering implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or information, usually a knowledge base. More commonly, question answering systems can pull answers from an unstructured collection of natural language documents.

Some examples of natural language document collections used for question answering systems include:

  • a local collection of reference texts
  • internal organization documents and web pages
  • compiled newswire reports
  • a set of Wikipedia pages
  • a subset of World Wide Web pages

Question answering research attempts to deal with a wide range of question types including: fact, list, definition, How, Why, hypothetical, semantically constrained, and cross-lingual questions.

  • Closed-domain question answering deals with questions under a specific domain (for example, medicine or automotive maintenance), and can exploit domain-specific knowledge frequently formalized in ontologies. Alternatively, closed-domain might refer to a situation where only a limited type of questions are accepted, such as questions asking for descriptive rather than procedural information. Question answering systems in the context of machine reading applications have also been constructed in the medical domain, for instance related to Alzheimer's disease.[2]
  • Open-domain question answering deals with questions about nearly anything, and can only rely on general ontologies and world knowledge. On the other hand, these systems usually have much more data available from which to extract the answer.


Two early question answering systems were BASEBALL[3] and LUNAR.[4] BASEBALL answered questions about Major League Baseball league over a period of one year. LUNAR, in turn, answered questions about the geological analysis of rocks returned by the Apollo moon missions. Both question answering systems were very effective in their chosen domains. In fact, LUNAR was demonstrated at a lunar science convention in 1971 and it was able to answer 90% of the questions in its domain posed by people untrained on the system. Further restricted-domain question answering systems were developed in the following years. The common feature of all these systems is that they had a core database or knowledge system that was hand-written by experts of the chosen domain. The language abilities of BASEBALL and LUNAR used techniques similar to ELIZA and DOCTOR, the first chatterbot programs.

SHRDLU was a highly successful question-answering program developed by Terry Winograd in the late 1960s and early 1970s. It simulated the operation of a robot in a toy world (the "blocks world"), and it offered the possibility of asking the robot questions about the state of the world. Again, the strength of this system was the choice of a very specific domain and a very simple world with rules of physics that were easy to encode in a computer program.

In the 1970s, knowledge bases were developed that targeted narrower domains of knowledge. The question answering systems developed to interface with these expert systems produced more repeatable and valid responses to questions within an area of knowledge. These expert systems closely resembled modern question answering systems except in their internal architecture. Expert systems rely heavily on expert-constructed and organized knowledge bases, whereas many modern question answering systems rely on statistical processing of a large, unstructured, natural language text corpus.

The 1970s and 1980s saw the development of comprehensive theories in computational linguistics, which led to the development of ambitious projects in text comprehension and question answering. One example of such a system was the Unix Consultant (UC), developed by Robert Wilensky at U.C. Berkeley in the late 1980s. The system answered questions pertaining to the Unix operating system. It had a comprehensive hand-crafted knowledge base of its domain, and it aimed at phrasing the answer to accommodate various types of users. Another project was LILOG, a text-understanding system that operated on the domain of tourism information in a German city. The systems developed in the UC and LILOG projects never went past the stage of simple demonstrations, but they helped the development of theories on computational linguistics and reasoning.

Specialized natural language question answering systems have been developed, such as EAGLi for health and life scientists.


As of 2001, question answering systems typically included a question classifier module that determines the type of question and the type of answer.[5]

Question answering methods[edit]

Question answering is very dependent on a good search corpus—for without documents containing the answer, there is little any question answering system can do. It thus makes sense that larger collection sizes generally lend well to better question answering performance, unless the question domain is orthogonal to the collection. The notion of data redundancy in massive collections, such as the web, means that nuggets of information are likely to be phrased in many different ways in differing contexts and documents,[6] leading to two benefits:

  1. By having the right information appear in many forms, the burden on the question answering system to perform complex NLP techniques to understand the text is lessened.
  2. Correct answers can be filtered from false positives by relying on the correct answer to appear more times in the documents than instances of incorrect ones.

Some question answering systems rely heavily on automated reasoning.[7][8]

Open domain question answering[edit]

In information retrieval, an open domain question answering system aims at returning an answer in response to the user's question. The returned answer is in the form of short texts rather than a list of relevant documents.[9] The system uses a combination of techniques from computational linguistics, information retrieval and knowledge representation for finding answers.

The system takes a natural language question as an input rather than a set of keywords, for example, "When is the national day of China?" The sentence is then transformed into a query through its logical form. Having the input in the form of a natural language question makes the system more user-friendly, but harder to implement, as there are various question types and the system will have to identify the correct one in order to give a sensible answer. Assigning a question type to the question is a crucial task, the entire answer extraction process relies on finding the correct question type and hence the correct answer type.

Keyword extraction is the first step for identifying the input question type.[10] In some cases, there are clear words that indicate the question type directly, i.e., "Who", "Where" or "How many", these words tell the system that the answers should be of type "Person", "Location", or "Number", respectively. In the example above, the word "When" indicates that the answer should be of type "Date". POS (part-of-speech) tagging and syntactic parsing techniques can also be used to determine the answer type. In this case, the subject is "Chinese National Day", the predicate is "is" and the adverbial modifier is "when", therefore the answer type is "Date". Unfortunately, some interrogative words like "Which", "What" or "How" do not give clear answer types. Each of these words can represent more than one type. In situations like this, other words in the question need to be considered. First thing to do is to find the words that can indicate the meaning of the question. A lexical dictionary such as WordNet can then be used for understanding the context.

Once the question type has been identified, an information retrieval system is used to find a set of documents containing the correct keywords. A tagger and NP/Verb Group chunker can be used to verify whether the correct entities and relations are mentioned in the found documents. For questions such as "Who" or "Where", a named-entity recogniser is used to find relevant "Person" and "Location" names from the retrieved documents. Only the relevant paragraphs are selected for ranking.

A vector space model can be used as a strategy for classifying the candidate answers. Check if the answer is of the correct type as determined in the question type analysis stage. An inference technique can also be used to validate the candidate answers. A score is then given to each of these candidates according to the number of question words it contains and how close these words are to the candidate, the more and the closer the better. The answer is then translated into a compact and meaningful representation by parsing. In the previous example, the expected output answer is "1st Oct."

Mathematical question answering[edit]

An open source math-aware question answering system based on Ask Platypus and Wikidata was published in 2018.[11] The system takes an English or Hindi natural language question as input and returns a mathematical formula retrieved from Wikidata as succinct answer. The resulting formula is translated into a computable form, allowing the user to insert values for the variables. Names and values of variables and common constants are retrieved from Wikidata if available. It is claimed that the system outperforms a commercial computational mathematical knowledge engine on a test set.

MathQA methods need to combine natural and formula language. One possible approach is to perform supervised annotation via Entity Linking. The "ARQMath Task" at CLEF 2020[12] was launched to address the problem of linking newly posted questions from the platform Math Stack Exchange (MSE) to existing ones that were already answered by the community.[13] The lab was motivated by the fact that Mansouri et al. discovered that 20% of the mathematical queries in general-purpose search engines are expressed as well-formed questions.[14] It contained two separate sub-tasks. Task 1: "Answer retrieval" matching old post answers to newly posed questions and Task 2: "Formula retrieval" matching old post formulae to new questions. Starting with the domain of mathematics, which involves formula language, the goal is to later extend the task to other domains (e.g., STEM disciplines, such as chemistry, biology, etc.), which employ other types of special notation (e.g., chemical formulae).[12][13]


Question answering systems have been extended in recent years to encompass additional domains of knowledge[15] For example, systems have been developed to automatically answer temporal and geospatial questions, questions of definition and terminology, biographical questions, multilingual questions, and questions about the content of audio, images,[16] and video.[17] Current question answering research topics include:

In 2011, Watson, a question answering computer system developed by IBM, competed in two exhibition matches of Jeopardy! against Brad Rutter and Ken Jennings, winning by a significant margin.[25] Facebook Research has made their DrQA system[26] available under an open source license. This system has been used for open domain question answering using Wikipedia as knowledge source.[27]


  1. ^ Philipp Cimiano; Christina Unger; John McCrae (1 March 2014). Ontology-Based Interpretation of Natural Language. Morgan & Claypool Publishers. ISBN 978-1-60845-990-2.
  2. ^ Roser Morante, Martin Krallinger, Alfonso Valencia and Walter Daelemans. Machine Reading of Biomedical Texts about Alzheimer's Disease. CLEF 2012 Evaluation Labs and Workshop. September 17, 2012
  3. ^ GREEN JR, Bert F; et al. (1961). "Baseball: an automatic question-answerer" (PDF). Western Joint IRE-AIEE-ACM Computer Conference: 219–224.
  4. ^ Woods, William A; Kaplan, R. (1977). "Lunar rocks in natural English: Explorations in natural language question answering". Linguistic Structures Processing 5. 5: 521–569.
  5. ^ Hirschman, L. & Gaizauskas, R. (2001) Natural Language Question Answering. The View from Here. Natural Language Engineering (2001), 7:4:275-300 Cambridge University Press.
  6. ^ Lin, J. (2002). The Web as a Resource for Question Answering: Perspectives and Challenges. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002).
  7. ^ Moldovan, Dan, et al. "Cogex: A logic prover for question answering." Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1. Association for Computational Linguistics, 2003.
  8. ^ Furbach, Ulrich, Ingo Glöckner, and Björn Pelzer. "An application of automated reasoning in natural language question answering." Ai Communications 23.2-3 (2010): 241-265.
  9. ^ Sun, Haitian; Dhingra, Bhuwan; Zaheer, Manzil; Mazaitis, Kathryn; Salakhutdinov, Ruslan; Cohen, William (2018). "Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text". Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium. pp. 4231–4242. arXiv:1809.00782. doi:10.18653/v1/D18-1455. S2CID 52154304.
  10. ^ Harabagiu, Sanda; Hickl, Andrew (2006). "Methods for using textual entailment in open-domain question answering". Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06. pp. 905–912. doi:10.3115/1220175.1220289.
  11. ^ Moritz Schubotz; Philipp Scharpf; et al. (12 September 2018). "Introducing MathQA: a Math-Aware question answering system". Information Discovery and Delivery. Emerald Publishing Limited. 46 (4): 214–224. doi:10.1108/IDD-06-2018-0022.
  12. ^ a b Zanibbi, Richard; Oard, Douglas W.; Agarwal, Anurag; Mansouri, Behrooz (2020), "Overview of ARQMath 2020: CLEF Lab on Answer Retrieval for Questions on Math", Lecture Notes in Computer Science, Cham: Springer International Publishing, pp. 169–193, doi:10.1007/978-3-030-58219-7_15, ISBN 978-3-030-58218-0, retrieved 2021-06-09
  13. ^ a b Bela, Scharpf, Philipp Schubotz, Moritz Greiner-Petter, Andre Ostendorff, Malte Teschke, Olaf Gipp (2020-12-04). ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?. OCLC 1228449497.
  14. ^ Mansouri, Behrooz; Zanibbi, Richard; Oard, Douglas W. (June 2019). "Characterizing Searches for Mathematical Concepts". 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL). IEEE: 57–66. doi:10.1109/jcdl.2019.00019. ISBN 978-1-7281-1547-4. S2CID 198972305.
  15. ^ Paşca, Marius (2005). "Book Review New Directions in Question Answering Mark T. Maybury (editor) (MITRE Corporation) Menlo Park, CA: AAAI Press and Cambridge, MA: The MIT Press, 2004, xi+336 pp; paperbound, ISBN 0-262-63304-3, $40.00, £25.95". Computational Linguistics. 31 (3): 413–417. doi:10.1162/089120105774321055. S2CID 12705839.
  16. ^ a b Anderson, Peter, et al. "Bottom-up and top-down attention for image captioning and visual question answering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
  17. ^ Zhu, Linchao, et al. "Uncovering the temporal context for video question answering." International Journal of Computer Vision 124.3 (2017): 409-421.
  18. ^ Quarteroni, Silvia, and Suresh Manandhar. "Designing an interactive open-domain question answering system." Natural Language Engineering 15.1 (2009): 73-95.
  19. ^ Yih, Wen-tau, Xiaodong He, and Christopher Meek. "Semantic parsing for single-relation question answering." Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2014.
  20. ^ Perera, R., Nand, P. and Naeem, A. 2017. Utilizing typed dependency subtree patterns for answer sentence generation in question answering systems.
  21. ^ "BitCrawl by Hobson Lane". Archived from the original on October 27, 2012. Retrieved 2012-05-29.CS1 maint: bot: original URL status unknown (link)
  22. ^ Perera, R. and Perera, U. 2012. Towards a thematic role based target identification model for question answering.
  23. ^ Bahadorreza Ofoghi; John Yearwood & Liping Ma (2008). The impact of semantic class identification and semantic role labeling on natural language answer extraction. The 30th European Conference on Information Retrieval (ECIR'08). Springer Berlin Heidelberg. pp. 430–437. doi:10.1007/978-3-540-78646-7_40.
  24. ^ Bahadorreza Ofoghi; John Yearwood & Liping Ma (2009). "The impact of frame semantic annotation levels, frame‐alignment techniques, and fusion methods on factoid answer processing". Journal of the American Society for Information Science and Technology. 60 (2): 247–263. doi:10.1002/asi.20989.
  25. ^ Markoff, John (2011-02-16). "On 'Jeopardy!' Watson Win is All but Trivial". The New York Times.
  26. ^ "DrQA".
  27. ^ Chen, Danqi; Fisch, Adam; Weston, Jason; Bordes, Antoine (2017). "Reading Wikipedia to Answer Open-Domain Questions". arXiv:1704.00051 [cs.CL].

Further reading[edit]

External links[edit]