Wikipedia:Wikipedia Signpost/2018-08-30/Recent research
The Battle for Wikipedia
- Reviewed by Bri
The "clean Wehrmacht" battle covered in the past three issues of The Signpost (May, June, July) is reviewed from a historian's perspective in The Journal of Slavic Military Studies. The title of the paper is an allusion to Lost Victories, today generally accepted as an unreliable and apologetic account of the actions of German forces during World War II. The author, David Stahel, who states that he is not a Wikipedia editor, examines the behind-the-scenes mechanisms and debates that result in article content, with the observation that these debates are not consistent with "consensus among serious historians" and "many people (and in my experience students) invest [Wikipedia] with a degree of objectivity and trust that, at least on topics related to the Wehrmacht, can at times be grossly misplaced...articles on the Wehrmacht (in English Wikipedia) might struggle to meet [the standard]". The author describes questionable arguments raised by several of the pro-Wehrmacht editors and concludes their writing "may in some instances reflect extremist views or romantic notions not grounded in the historiography".
Readers prefer summaries written by a neural network over those by Wikipedians 40% of the time – but it still suffers from hallucinations
- Reviewed by Tilman Bayer
Several recent publications tackle the problem of taking machine-readable factual statements about a notable person, such as their date of birth from the Wikidata item about them, and creating a biographical summary in natural language.
A paper by three researchers from Australia reports on using an artificial intelligence approach for "the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs". These are modeled after the first sentences of biographical Wikipedia articles, which, the authors argue, are of particular value because they form "clear and concise biographical summaries". The task of generating them involves deciding which of the facts to include (e.g. the date of birth or a political party that the subject is a member of), and arranging them into a natural language sentence. To achieve this in an automated fashion, the authors trained a recurrent neural network (RNN) implemented in TensorFlow on a corpus of several hundred thousand introductory sentences extracted from English Wikipedia articles about humans, together with the corresponding Wikidata entries. (Although not mentioned in the paper, such first sentences are the subject of a community guideline on the English Wikipedia, at least some aspects of which one might expect the neural network to reconstruct from the corpus.)
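To make the input side of this task more concrete, here is a minimal sketch of how slot-value pairs might be flattened into a token sequence for a sequence-to-sequence model. The `linearize` helper, the slot naming and the alphabetical ordering are assumptions made for illustration, not the authors' actual preprocessing, and the facts are read off the example sentence rather than taken from the Wikidata item.

```python
def linearize(facts):
    """Flatten slot-value pairs into one token list.

    `facts` maps a slot name to its value. Slots are emitted in sorted
    order so the same facts always give the same sequence (an assumption;
    the paper may order its input differently).
    """
    tokens = []
    for slot in sorted(facts):
        # One marker token per slot, e.g. "date of birth" -> "DATE_OF_BIRTH"
        tokens.append(slot.upper().replace(" ", "_"))
        # Followed by the lower-cased, whitespace-split value
        tokens.extend(facts[slot].lower().split())
    return tokens


# Hypothetical input, based on the example sentence about Bob Cortner
facts = {
    "date of birth": "April 16 1927",
    "date of death": "May 19 1959",
    "occupation": "racing driver",
}
sequence = linearize(facts)
print(sequence)
```

A network of the kind described would then be trained to map such a sequence to the human-written first sentence.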
An example of the algorithm's output compared to the Wikipedia original (excerpted from Table 5 in the paper):
|Wikipedia original||robert charles cortner ( april 16 , 1927 – may 19 , 1959 ) was an american automobile racing driver from redlands , california .|
|Algorithm variant "S2S"||bob cortner ( april 16 , 1927 |
|Algorithm variant "S2S+AE"||robert cortner ( april 16 , 1927 – may 19 , 1959 ) was an american race-car driver .|
The quality of the algorithm's output (in several variants) was evaluated against the actual human-written sentences from Wikipedia (as the "gold standard") with a standard automated test (BLEU), but also by human readers recruited from CrowdFlower. This "human preference evaluation suggests the model is nearly as good as the Wikipedia reference", with the consensus of the human raters even preferring the neural network's version 40% of the time. However, those of the algorithm's variants that are allowed to infer facts not directly stated in the Wikidata item can suffer from the problem of AI "hallucinations", e.g. the struck-out parts in the above example, claiming that Bob Cortner was a boxer instead of a race-car driver, and died in 2005 instead of 1959.
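For readers unfamiliar with BLEU, the following sketch shows its core ingredient, the clipped ("modified") n-gram precision of a candidate sentence against a reference. It is a simplified illustration only: full BLEU also combines several n-gram orders and applies a brevity penalty, and nothing here reproduces the paper's evaluation setup. The two token lists are the closing portions of the Wikipedia original and the "S2S+AE" output from the example above.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference,
    counting each n-gram at most as often as the reference contains it."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


reference = ("was an american automobile racing driver "
             "from redlands , california .").split()
candidate = "was an american race-car driver .".split()

p1 = modified_precision(candidate, reference, 1)  # unigram precision
p2 = modified_precision(candidate, reference, 2)  # bigram precision
print(round(p1, 3), round(p2, 3))
```

The drop from unigram to bigram precision shows why a fluent paraphrase such as "race-car driver" can still score poorly on an automated metric, which is one reason to complement it with human ratings.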
Apart from describing and evaluating the algorithm, the paper also provides some results about Wikipedia itself, e.g. showing which biographical facts are most frequently used by Wikipedia editors. Table 1 from the paper lists "the top fifteen slots across entities used for input, and the % of time the value is a substring in the entity’s first sentence" in the examined corpus:
|Slot||Count||% in first sentence|
|SEX OR GENDER||1,007,575||0|
|DATE OF BIRTH||817,942||88|
|DATE OF DEATH||346,168||86|
|PLACE OF BIRTH||298,374||25|
|PLACE OF DEATH||107,188||17|
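The percentage column in this table can be understood as a simple substring test. The sketch below computes it for a toy corpus; the two entities are invented for illustration, and the data layout (`first_sentence`, `facts`) is an assumption rather than the format used by the authors.

```python
def slot_coverage(entities, slot):
    """Return (count, percent): how many entities have `slot`, and the
    percentage of those whose value appears verbatim in the first sentence."""
    with_slot = [e for e in entities if slot in e["facts"]]
    if not with_slot:
        return 0, 0.0
    hits = sum(
        e["facts"][slot].lower() in e["first_sentence"].lower()
        for e in with_slot
    )
    return len(with_slot), 100.0 * hits / len(with_slot)


# Made-up toy corpus (not data from the paper)
entities = [
    {
        "first_sentence": "jane doe ( born 1 january 1950 ) is a fictional chemist .",
        "facts": {"DATE OF BIRTH": "1 january 1950", "SEX OR GENDER": "female"},
    },
    {
        "first_sentence": "john roe was a fictional painter from exampletown .",
        "facts": {
            "DATE OF BIRTH": "2 february 1900",
            "SEX OR GENDER": "male",
            "PLACE OF BIRTH": "exampletown",
        },
    },
]

for slot in ("SEX OR GENDER", "DATE OF BIRTH", "PLACE OF BIRTH"):
    print(slot, slot_coverage(entities, slot))
```

A value such as "female" almost never appears literally in a first sentence, which is consistent with the 0% reported for SEX OR GENDER: a slot can presumably still influence the sentence (for instance via pronouns) without being quoted in it.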
The paper's literature review mentions a 2016 paper titled "Neural Text Generation from Structured Data with Application to the Biography Domain" as "the closest work to ours with a similar task using Wikipedia infoboxes in place of Wikidata. They condition an attentional neural language model (NLM) on local and global properties of infobox tables [...] They use 723k sentences from Wikipedia articles with 403k lower-cased words mapping to 1,740 distinct facts".
While the authors of both papers commendably make at least some of their code and data available on GitHub, they do not seem to have aimed to make their algorithms into a tool for generating text for use in Wikipedia itself – perhaps wisely so, as previous efforts in this direction have met with community opposition due to quality concerns (e.g. in the case of a paper we covered previously here: "Bot detects theatre play scripts on the web and writes Wikipedia articles about them").
In the third, most recent research effort, covered in several publications, another group of researchers likewise developed a method to automatically generate summaries of Wikipedia article topics via a neural network, based on structured data from Wikidata (and, in one variant, DBpedia).
They worked directly with community members from two small Wikipedias (Arabic and Esperanto) to evaluate "not only the quality of the generated text, but also the usefulness of our end-system to any underserved Wikipedia version", when extending the existing ArticlePlaceholder feature that is in use on some of these smaller Wikipedias. The result was "that members of the targeted language communities rank our text close to the expected quality standards of Wikipedia, and are likely to consider the generated text as part of Wikipedia. Lastly, we found that the editors are likely to reuse a large portion of the generated summaries [when writing actual Wikipedia articles], thus emphasizing the usefulness of our approach to its intended audience."
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.
- Compiled by Tilman Bayer
The Wikipedia Adventure: Beloved but ineffective
- "The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users"
From the accompanying blog post: "The system was a gamified tutorial for new Wikipedia editors. Working with the tutorial creators, we conducted both a survey of its users and a randomized field experiment testing its effectiveness in encouraging subsequent contributions. We found that although users loved it, it did not affect subsequent participation rates."
Told you so: Hindsight bias in Wikipedia articles about events
Two papers by the same team of researchers explore this topic for Wikipedia editors and readers, respectively:
- "Biases in the production and reception of collective knowledge: the case of hindsight bias in Wikipedia"
From the paper:
Study 1: This study investigated whether events in Wikipedia articles are represented as more likely in retrospect. For a total of 33 events, we retrieved article versions from the German Wikipedia that existed prior to the event (foresight) or after the event had happened (hindsight) and assessed indicators of hindsight bias in those articles [...] we determined the number of words of the categories "cause" (containing words such as "hence"), "certainty" (e.g., "always"), tentativeness (e.g., "maybe"), "insight" (e.g., "consider"), and "discrepancy" (e.g., "should"), because the hindsight perspective is assumed to be the result of successful causal modeling [...] There was an increase in the proportion of hindsight-related words across article versions. [...] We investigated whether there is evidence for hindsight distortions in Wikipedia articles or whether Wikipedia’s guidelines effectively prevent hindsight bias to occur. Our study provides empirical evidence for both.
- "Cultural Interpretations of Global Information? Hindsight Bias after Reading Wikipedia Articles across Cultures"
From the abstract: "We report two studies with Wikipedia articles and samples from different cultures (Study 1: Germany, Singapore, USA, Vietnam, Japan, Sweden, N = 446; Study 2: USA, Vietnam, N = 144). Participants read one of two article versions (foresight and hindsight) about the Fukushima Nuclear Plant and estimated the likelihood, inevitability, and foreseeability of the nuclear disaster. Reading the hindsight article increased individuals' hindsight bias independently of analytic or holistic thinking style."
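The word-category measure used in Study 1 of the first of these two papers can be sketched roughly as follows. The lexicon contains only the example words quoted from that paper, in English; the study itself examined German Wikipedia articles and its full word lists are not reproduced here, so the lexicon and the `category_proportions` helper are illustrative assumptions.

```python
import re

# Tiny illustrative lexicon: one example word per category, as quoted
# in the paper excerpt. The real category lists are much larger.
CATEGORIES = {
    "cause": {"hence"},
    "certainty": {"always"},
    "tentativeness": {"maybe"},
    "insight": {"consider"},
    "discrepancy": {"should"},
}


def category_proportions(text):
    """Return, per category, the share of words in `text` that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {
        name: (sum(w in lexicon for w in words) / total if total else 0.0)
        for name, lexicon in CATEGORIES.items()
    }


# Made-up sentence standing in for an article version
sample = "Maybe the plant should always be inspected hence the rule"
proportions = category_proportions(sample)
print(proportions)
```

Comparing such proportions between the foresight and hindsight versions of the same article would give the kind of indicator the study describes.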
"WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval"
From the abstract: "...we introduce a new Wikipedia based collection specific for non-factoid answer passage retrieval containing thousands of questions with annotated answers and show benchmark results on a variety of state of the art neural architectures and retrieval models."
"Analysis of Wikipedia-based Corpora for Question Answering"
From the abstract: "This paper gives comprehensive analyses of corpora based on Wikipedia for several tasks in question answering. Four recent corpora are collected, WikiQA, SelQA, SQuAD, and InfoQA, and first analyzed intrinsically by contextual similarities, question types, and answer categories. These corpora are then analyzed extrinsically by three question answering tasks, answer retrieval, selection, and triggering."
"Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia"
From the abstract: "We study the task of generating from Wikipedia articles question-answer pairs that cover content beyond a single sentence. We propose a neural network approach that incorporates coreference knowledge via a novel gating mechanism. [...] We apply our system [...] to the 10,000 top-ranking Wikipedia articles and create a corpus of over one million question-answer pairs."
Asking Wikidata questions in natural language
From the abstract: "We first introduce a new approach for translating natural language questions to SPARQL queries. It is able to query several KBs [knowledge bases] simultaneously, in different languages, and can easily be ported to other KBs and languages. In our evaluation, the impact of our approach is proven using 5 different well-known and large KBs: Wikidata, DBpedia, MusicBrainz, DBLP and Freebase as well as 5 different languages namely English, German, French, Italian and Spanish." Online demo: https://wdaqua-frontend.univ-st-etienne.fr/
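As an illustration of the target of such a translation, the sketch below builds the kind of SPARQL query that a question like "When was Douglas Adams born?" could correspond to on Wikidata. It is a hand-written template, not the approach from the paper, and the `birth_date_query` helper is hypothetical. The identifiers used are Wikidata's P31 ("instance of"), Q5 ("human") and P569 ("date of birth"); the `wd:`, `wdt:` and `rdfs:` prefixes are assumed to be predefined, as they are on the Wikidata Query Service.

```python
def birth_date_query(label, language="en"):
    """Build a SPARQL query for the birth date of a human with this label."""
    # Escape backslashes and double quotes so the label is a valid literal
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?person ?dob WHERE {\n"
        f'  ?person rdfs:label "{escaped}"@{language} ;\n'
        "          wdt:P31 wd:Q5 ;\n"
        "          wdt:P569 ?dob .\n"
        "}"
    )


query = birth_date_query("Douglas Adams")
print(query)
```

The difficulty addressed by the paper lies in producing such queries automatically for arbitrary questions, knowledge bases and languages, rather than from a fixed template.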
References
- Stahel, David (18 July 2018). "The Battle for Wikipedia: The New Age of 'Lost Victories'?" (PDF). Historical. The Journal of Slavic Military Studies. Routledge. 31 (3): 396–402. doi:10.1080/13518046.2018.1487198. eISSN 1556-3006. ISSN 1351-8046. OCLC 7781539362. Wikidata 55972890. Retrieved 28 August 2018.
- Chisholm, Andrew; Radford, Will; Hachey, Ben (3–7 April 2017). "Learning to generate one-sentence biographies from Wikidata" (PDF). In Lapata, Mirella; Blunsom, Phil; Koller, Alexander (eds.). Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017). Valencia, Spain: Association for Computational Linguistics. pp. 633–642. arXiv:1702.06235v1. doi:10.18653/v1/E17-1060. ISBN 978-1-945626-34-0. ACL Anthology E17-1060. Wikidata 28819478. Archived from the original on 29 August 2018. Retrieved 28 August 2018.
- Lebret, Rémi; Grangier, David; Auli, Michael (1–5 November 2016). "Neural Text Generation from Structured Data with Application to the Biography Domain" (PDF). In Su, Jian; Duh, Kevin; Carreras, Xavier (eds.). Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing (EMNLP 2016). Austin, Texas: Association for Computational Linguistics. pp. 1203–1213. doi:10.18653/v1/D16-1128. ISBN 978-1-945626-25-8. ACL Anthology D16-1128. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
- Kaffee, Lucie-Aimée; Elsahar, Hady; Vougiouklis, Pavlos; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (14 February 2018). "Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders". In Gangemi, Aldo; Navigli, Roberto; Vidal, María-Esther; Hitzler, Pascal; Troncy, Raphaël; Hollink, Laura; Alam, Mehwish (eds.). The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, Proceedings. Extended Semantic Web Conference (ESWC 2018) (Preprint). Lecture Notes in Computer Science. 11 (Online ed.). Cham, Switzerland: Springer Science+Business Media (published 3 June 2018). pp. 319–334. doi:10.1007/978-3-319-93417-4_21. eISSN 1611-3349. ISBN 978-3-319-93417-4. ISSN 0302-9743. LCCN 2018946633. OCLC 7667759818. LNCS 10843. Wikidata 50290303. Archived from the original on 29 August 2018. Retrieved 29 August 2018 – via Silvio Peroni.
- Kaffee, Lucie-Aimée; Elsahar, Hady; Vougiouklis, Pavlos; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (1–6 June 2018). "Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata". In Walker, Marilyn; Ji, Heng; Stent, Amanda (eds.). Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2018). New Orleans, Louisiana: Association for Computational Linguistics. pp. 640–645. arXiv:1803.07116v2. doi:10.18653/v1/N18-2101. ISBN 978-1-948087-29-2. OCLC 7667759818. ACL Anthology N18-2101. Wikidata 50827579. RG 323905026.
- Vougiouklis, Pavlos; Elsahar, Hady; Kaffee, Lucie-Aimée; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (30 July 2018). "Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples" (PDF). Journal of Web Semantics. Elsevier. arXiv:1711.00155. doi:10.1016/j.websem.2018.07.002. eISSN 1873-7749. ISSN 1570-8268. OCLC 7794877956. Wikidata 45322945. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
- Narayan, Sneha; Orlowitz, Jake; Morgan, Jonathan; Hill, Benjamin Mako; Shaw, Aaron (25 February – 1 March 2017). "The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users" (PDF). Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17). New York, NY: Association for Computing Machinery. pp. 1785–1799. doi:10.1145/2998181.2998307. ISBN 978-1-4503-4335-0. Wikidata 37816091. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
- Oeberst, Aileen; Beck, Ina von der; Back, Mitja D.; Cress, Ulrike; Nestler, Steffen (17 April 2017). "Biases in the production and reception of collective knowledge: the case of hindsight bias in Wikipedia" (DOC). Psychological Research (Preprint). Berlin; Heidelberg: Springer Berlin Heidelberg: 1–17. doi:10.1007/s00426-017-0865-7. eISSN 1430-2772. ISSN 0340-0727. OCLC 7016703631. PMID 28417198. Wikidata 29647478. Archived from the original on 29 August 2018. Retrieved 29 August 2018 – via ResearchGate.
- Beck, Ina von der; Oeberst, Aileen; Cress, Ulrike; Nestler, Steffen (22 May 2017). "Cultural Interpretations of Global Information? Hindsight Bias after Reading Wikipedia Articles across Cultures". Applied Cognitive Psychology. John Wiley & Sons. 31 (3): 315–325. doi:10.1002/acp.3329. eISSN 1099-0720. ISSN 0888-4080. OCLC 7065844160. Wikidata 30062753.
- Cohen, Daniel; Yang, Liu; Croft, W. Bruce (8–12 July 2018). "WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval". SIGIR #41 Proceedings. 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR '18). New York, NY: Association for Computing Machinery (published 27 June 2018). pp. 1165–1168. arXiv:1805.03797v1. doi:10.1145/3209978.3210118. ISBN 978-1-4503-5657-2.
- Jurczyk, Tomasz; Deshmane, Amit; Choi, Jinho D. (5 February 2018). "Analysis of Wikipedia-based Corpora for Question Answering" (PDF). arXiv:1801.02073v2 [cs.CL].
- Du, Xinya; Cardie, Claire (15–20 July 2018). "Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia" (PDF). In Miyao, Yusuke; Gurevych, Iryna (eds.). Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018). Melbourne, Australia: Association for Computational Linguistics. pp. 1907–1917. arXiv:1805.05942v1. doi:10.18653/v1/P18-1177. ISBN 978-1-948087-32-2. ACL Anthology P18-1177. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
- Diefenbach, Dennis; Both, Andreas; Singh, Kamal; Maret, Pierre (17 June 2018). Polleres, Alex (ed.). "Towards a Question Answering System over the Semantic Web". Semantic Web – Interoperability, Usability, Applicability (Preprint). IOS Press. 0 (0). arXiv:1803.00832. eISSN 2210-4968. ISSN 1570-0844. Wikidata 50418915. Archived from the original on 29 August 2018. Retrieved 29 August 2018.