Wikipedia:Wikipedia Signpost/2016-09-06/Recent research

{{Use dmy dates|date=September 2016}}
<noinclude>
{{Signpost draft


===AI-generated Wikipedia articles give rise to debate about research ethics===
At the [[International Joint Conference on Artificial Intelligence]] (IJCAI) – one of the prime AI conferences, if not the pre-eminent one – Banerjee and Mitra from [[Pennsylvania State University|Penn State]] published the paper "WikiWrite: Generating Wikipedia Articles Automatically".<ref>Siddhartha Banerjee, Prasenjit Mitra, [http://www.ijcai.org/Abstract/16/389 "WikiWrite: Generating Wikipedia Articles Automatically"].</ref>


The system described in the paper looks for [[WP:Red link|red links]] in Wikipedia and classifies them based on their context. To find section titles, it then looks for similar existing articles. With these titles, the system searches the web for information, and eventually uses content summarization and a paraphrasing algorithm. The researchers uploaded 50 of these automatically created articles to Wikipedia, and found that 47 of them survived. Some were heavily edited after upload, others not so much.
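
In outline, the pipeline can be pictured as follows – a toy sketch for illustration only, in which every function, section template and data item is invented here rather than taken from the paper (the real system uses trained classifiers, web retrieval and a paraphrasing model):

<syntaxhighlight lang="python">
# Toy sketch of the WikiWrite pipeline as described above. Every stage is a
# deliberately simplified stand-in; names and data are invented for illustration.

# Stand-in data: section layouts borrowed from similar existing articles.
SECTION_TEMPLATES = {
    "person": ["Early life", "Career", "Legacy"],
    "concept": ["Definition", "History", "Criticism"],
}

def classify_red_link(context: str) -> str:
    """Stand-in for the context classifier (here: a crude keyword rule)."""
    return "person" if "born" in context else "concept"

def retrieve_summarize_paraphrase(topic: str, section: str) -> str:
    """Stand-in for web search, content summarization and paraphrasing."""
    return f"(paraphrased summary of web content on '{topic}: {section}')"

def wikiwrite(red_link: str, context: str) -> dict:
    """Draft an article for one red link: classify it from its context,
    borrow a section layout from similar articles, then fill each section."""
    sections = SECTION_TEMPLATES[classify_red_link(context)]
    return {s: retrieve_summarize_paraphrase(red_link, s) for s in sections}

print(wikiwrite("Jane Example", "... was born in 1900 ..."))
</syntaxhighlight>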


While I was enthusiastic about the results, I was surprised by the suboptimal quality of the three articles I reviewed that were mentioned in the paper. After a brief discussion with the authors, a wider discussion was initiated on the [https://lists.wikimedia.org/pipermail/wiki-research-l/2016-August/005324.html Wiki Research list]. This was followed by an entry on the [[Wikipedia:Administrators' noticeboard/IncidentArchive931#Moving discussion from wikimedia research mailing list|English Wikipedia administrators' noticeboard]] (which includes a list of all accounts used for this particular research paper). The discussion led to the removal of most of the remaining articles; only a very few that had been improved over time by other editors were kept.


The discussion concerned the ethical implications of the research, and using Wikipedia for such an experiment without the consent of Wikipedia contributors or readers. The first author of the paper was an active member of the discussion; he showed a lack of awareness of these issues, and appeared to learn a lot from the discussion. He promised to take these lessons to the relevant research community – a positive outcome.


In general, this sets an example for engineers and computer scientists, who often show a lack of awareness of certain ethical issues in their research. Computer scientists are typically trained to think about bits and complexities, and rarely discuss in depth how their work impacts human lives. Whether it's social networks experimenting with the mood of their users, current discussions of biases in machine-learned models, or the experimental upload of automatically created content in Wikipedia without community approval, computer science has generally not reached the level of awareness of some other sciences for the possible effects of their research on human subjects.


Even in Wikipedia, there's no clear-cut, succinct policy I could have pointed the researchers to. The use of sockpuppets – an incidental side-effect of the research – was a clear violation of policy. [[WP:POINT]] was a stretch to cover the situation at hand. In the end, what we can suggest to researchers is to check back with the Wikimedia Research list. A lot of people there have experience with designing research plans with the community in mind, and it can help to avoid uncomfortable situations.


''See also our 2015 review of a related paper coauthored by the same authors: "[[m:Research:Newsletter/2015/January#Bot_detects_theatre_play_scripts_on_the_web_and_writes_Wikipedia_articles_about_them|Bot detects theatre play scripts on the web and writes Wikipedia articles about them]]" and other similarly themed papers they have published since then: "WikiKreator: Automatic Authoring of Wikipedia Content"<ref>{{Cite journal| doi = 10.1145/2813536.2813538| issn = 2372-3483| volume = 2| issue = 1| pages = 4–6| last1 = Banerjee| first1 = Siddhartha| last2 = Mitra| first2 = Prasenjit| title = WikiKreator: Automatic Authoring of Wikipedia Content| journal = AI Matters| date = October 2015| url = http://doi.acm.org/10.1145/2813536.2813538}} {{closed access}}</ref>, "WikiKreator: Improving Wikipedia Stubs Automatically"<ref>Banerjee, Siddhartha and Mitra, Prasenjit: [http://www.aclweb.org/anthology/P15-1084 "WikiKreator: Improving Wikipedia Stubs Automatically"], Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), July 2015, Beijing, China, Association for Computational Linguistics, pages 867–877.</ref>, "Filling the Gaps: Improving Wikipedia Stubs"<ref>{{Cite conference| publisher = ACM| doi = 10.1145/2682571.2797073| isbn = 9781450333078| pages = 117–120| last1 = Banerjee| first1 = Siddhartha| last2 = Mitra| first2 = Prasenjit| title = Filling the Gaps: Improving Wikipedia Stubs| booktitle = Proceedings of the 2015 ACM Symposium on Document Engineering| location = New York, NY, USA| series = DocEng '15| date = 2015| url = http://doi.acm.org/10.1145/2682571.2797073}} {{closed access}}</ref>.'' <small>[[User:Denny|DV]]</small>


===Ethics researcher: Vandal fighters should not be allowed to see whether an edit was made anonymously===
A paper<ref>{{Cite journal| doi = 10.1007/s10676-016-9399-8| issn = 1388-1957, 1572-8439| pages = 1–18| last = Laat| first = Paul B.| title = Profiling vandalism in Wikipedia: A Schauerian approach to justification| journal = Ethics and Information Technology| date = 30 April 2016| url = http://link.springer.com/article/10.1007/s10676-016-9399-8}}</ref> in the journal ''Ethics and Information Technology'' examines the "system of surveillance" that the English Wikipedia has built up over the years to deal with vandalism edits. The author, Paul B. de Laat from the University of Groningen, presents an interesting application of a theoretical framework by US law scholar [[Frederick Schauer]] that focuses on the concepts of rule enforcement and profiling. While providing justification for the system's efficacy and largely absolving it of some of the objections that are commonly associated with the use of profiling in e.g. law enforcement, the paper ultimately argues that in its current form, it violates an alleged "social contract" on Wikipedia by not treating anonymous and logged-in edits equally. While generally well-informed about both the practice and the academic research of vandalism fighting, the paper unfortunately fails to connect to an existing discussion about very much the same topic – potential biases of artificial intelligence-based anti-vandalism tools against anonymous edits – that was begun last year by the researchers developing ORES (an edit review tool that was just made available to all English Wikipedia users; see this week's [[Wikipedia:Wikipedia_Signpost/Next_issue/Technology_report|Technology report]]) and most recently presented in the [[mw:Wikimedia_Research/Showcase#August_2016|August 2016 WMF research showcase]].


The paper first gives an overview of the various [[Wikipedia:Cleaning up vandalism/Tools|anti-vandalism tools]] and bots in use, recapping an earlier paper<ref>{{Cite journal| doi = 10.1007/s10676-015-9366-9| issn = 1388-1957, 1572-8439| volume = 17| issue = 3| pages = 175–188| last = Laat| first = Paul B. de| title = The use of software tools and autonomous bots against vandalism: eroding Wikipedia’s moral order?| journal = Ethics and Information Technology| date = 2 September 2015| url = http://link.springer.com/article/10.1007/s10676-015-9366-9}}</ref> where de Laat had already asked whether these are "eroding Wikipedia’s moral order" (following an even earlier [[m:Research:Newsletter/2014/May#cite_ref-14|2014 paper]] in which he argued that new-edit patrolling "raises a number of moral questions that need to be answered urgently"). There, de Laat's concerns included the fact that some stronger tools (rollback, Huggle, and STiki) are available only to trusted users, that they "cause a loss of the required moral skills in relation to newcomers", and that there is a lack of transparency about how the tools operate (in particular when more sophisticated artificial intelligence/machine learning algorithms such as neural networks are used). The present paper expands on a separate but related concern: the use of "profiling" to pre-select which recent edits will be subject to closer human review. The author emphasizes that on Wikipedia this usually does not mean person-based [[offender profiling]] (building profiles of individuals committing vandalism), citing only one exception in the form of a 2015 academic paper (cf. our review: "[[m:Research:Newsletter/2015/August#Early_warning_system_identifies_likely_vandals_based_on_their_editing_behavior|Early warning system identifies likely vandals based on their editing behavior]]"). Rather, "the anti-vandalism tools exemplify the broader type of profiling" that focuses on actions. Based on Schauer's work, the author asks the following questions:
# "Is this profiling profitable, does it bring the rewards that are usually associated with it?"
# "Is this profiling profitable, does it bring the rewards that are usually associated with it?"
# "... is this profiling approach towards edit selection justified? In particular, do any of the dimensions in use raise moral objections? If so, can these objections be met in a satisfactory fashion, or do such controversial dimensions have to be adapted or eliminated?"
# "is this profiling approach towards edit selection justified? In particular, do any of the dimensions in use raise moral objections? If so, can these objections be met in a satisfactory fashion, or do such controversial dimensions have to be adapted or eliminated?"


To answer the first question, the author turns to Schauer's work on rules, in a brief [https://link.springer.com/article/10.1007/s10676-016-9399-8#Sec5 summary] that is worth reading for anyone interested in Wikipedia policies and guidelines in general, although de Laat instead applies the concept to the "procedural rules" implicit in vandalism profiling (such as that anonymous edits are more likely to be worth scrutinizing).
First, Schauer "resolutely pushes aside the argument from fairness: decision-making based on rules can only be less just than deciding each case on a particularistic basis". (For example, a restaurant's "No Dogs Allowed" rule will unfairly exclude some dogs so well-behaved that their presence would create no problems, while not prohibiting much more dangerous animals such as snakes.) Instead, the existence of rules has to be justified by other arguments, of which Schauer presents four:
*Rules "create ''reliability/predictability'' for those affected by the rule: rule-followers as well as rule-enforcers".
*Rules "create ''reliability/predictability'' for those affected by the rule: rule-followers as well as rule-enforcers".
*Rules "promote more efficient use of resources by rule-enforcers" (e.g. in case of a speeding car driver, traffic police and judges can apply a simple speed limit instead having to prove in detail that dangerous driving happened).
*Rules "promote more efficient use of resources by rule-enforcers" (e.g. in case of a speeding car driver, traffic police and judges can apply a simple speed limit instead having to prove in detail that an instance of driving was dangerous).
*Rules, if they are simple enough, reduce the problem of "risk-aversion" by enforcers, who are much more likely to make mistakes and face repercussions if they have to make case by case decisions.
*Rules, if simple enough, reduce the problem of "risk-aversion" by enforcers, who are much more likely to make mistakes and face repercussions if they have to make case by case decisions.
*Rules create stability, which however also presents "an impediment to change; it entrenches the status-quo. If change is on a society’s agenda, the stability argument turns into an argument against having (simple) rules."
*Rules create stability, but also present "an impediment to change; they entrench the status-quo. If change is on a society’s agenda, the stability argument turns into an argument against having (simple) rules."


The author cautions that these arguments have to be reinterpreted when applying them to the aforementioned vandalism profiling, because it consists of "procedural rules" (which edits should be selected for inspection) rather than "substantive rules" (which edits should be reverted as vandalism, which animals should be disallowed from the restaurant). While in the case of substantive rules, their absence would mean having to judge everything on a case-by-case basis, the author asserts that procedural rules arise in a situation where the alternative would be not to judge at all in many cases, because "we have no means at our disposal to check and pass judgment on all of them; a selection of a kind has to be made. So it is here that profiling comes in". With that qualification, Schauer's second argument provides justification for "Wikipedian profiling [because it] turns out to be amazingly effective", starting with the autonomous bots that auto-revert with an (aspired) 1:1000 false-positive rate.
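
The (aspired) 1:1000 figure can be read as a calibration target: the bot's revert threshold is set so that only about one in a thousand legitimate edits would be auto-reverted. Here is a minimal sketch of how such a threshold can be picked from labelled data – the scores below are synthetic, and this is not ClueBotNG's actual calibration code:

<syntaxhighlight lang="python">
"""Sketch: choose a revert threshold so that roughly 1 in 1000 good-faith
edits would be auto-reverted. Synthetic scores; not ClueBotNG's real code."""
import random

random.seed(0)
good = [random.betavariate(2, 8) for _ in range(100_000)]  # non-vandalism scores
bad = [random.betavariate(8, 2) for _ in range(5_000)]     # vandalism scores

def threshold_for_fp_rate(good_scores, max_fp_rate=0.001):
    """Threshold at the (1 - max_fp_rate) quantile of good-edit scores,
    so about max_fp_rate of good edits score at or above it."""
    ranked = sorted(good_scores)
    return ranked[int(len(ranked) * (1 - max_fp_rate))]

t = threshold_for_fp_rate(good)
recall = sum(s >= t for s in bad) / len(bad)
print(f"revert threshold {t:.3f} catches {recall:.1%} of vandalism")
</syntaxhighlight>
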
De Laat also interprets "the Schauerian argument of ''reliability/predictability'' for those affected by the rule" in favor of vandalism profiling. Here, though, he fails to explain the benefits of vandals being able to predict which kind of edits will be subject to scrutiny. This also calls into question his subsequent remark that "it is unfortunate that the anti-vandalism system in use remains opaque to ordinary users". The remaining two of Schauer's four arguments are judged as less pertinent. But overall the paper concludes that it is possible to justify the existence of vandalism profiling rules as beneficial via Schauer's theoretical framework.


Next, de Laat turns to question 2, on whether vandalism profiling is also morally justified. Here he relies on later work by Schauer, from a 2003 book, ''Profiles, Probabilities, and Stereotypes'', that studies such matters as profiling by tax officials (selecting which taxpayers have to undergo an audit), airport security (selecting passengers for screening) and police officers (e.g. selecting cars for traffic stops). While profiling of some kind is a necessity for all these officials, the particular characteristics (dimensions) used for profiling can be highly problematic (see e.g. [[Driving While Black]]). For de Laat's study of Wikipedia profiling, "two types of complications are important: (1) possible ‘overuse’ of dimension(s) (an issue of profile effectiveness) and (2) social sensibilities associated with specific dimension(s) (a social and moral issue)." Overuse can mean relying on stereotypes that have no basis in reality, or over-reliance on some dimensions that, while having a non-spurious correlation with the deviant behavior, are over-emphasized at the expense of other relevant characteristics because they are more visible or salient to the profile. For example, while Schauer considers that it may be justified for "airport officials looking for explosives [to] single out for inspection the luggage of younger Muslim men of Middle Eastern appearance", it would be overuse if "officials ask ''all'' Muslim men and ''all'' men of Middle Eastern origin to step out of line to be searched", thus reducing their effectiveness by neglecting other passenger characteristics. This is also an example of the second type of complication, where the selected dimensions are socially sensitive – indeed, for the specific case of luggage screening in the US, "the factors of race, religion, ethnicity, nationality, and gender have expressly been excluded from profiling" since 1997.


Applying this to the case of Wikipedia's anti-vandalism efforts, de Laat first observes that complication (1) (overuse) is not a concern for fully automated tools like ClueBotNG: their algorithm applies the existing profile directly, without a human intervention that could introduce this kind of bias. For Huggle and STiki, however, "I see several possibilities for features to be overused by patrollers, thereby spoiling the optimum efficacy achievable by the profile embedded in those tools." This is because both tools not only use these features in their automatic pre-selection of edits to be reviewed, but also expose at least one of them – whether an edit was made anonymously – to the human patroller in the edit review interface. (The paper examines this in detail for both tools, observing that Huggle presents more opportunities for this kind of overuse, while STiki is more restricted. However, there seems to have been no attempt to study empirically whether this overuse actually occurs.)
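
To make concrete what the "profile embedded in those tools" might look like, here is a minimal sketch of action-based scoring over edit features. The feature names and weights are invented for illustration – this is not Huggle's or STiki's actual model – and de Laat's proposal, discussed below, would amount to dropping the anonymity term:

<syntaxhighlight lang="python">
"""Illustrative sketch of action-based vandalism profiling. Features and
weights are invented; this is not Huggle's or STiki's actual model."""
from dataclasses import dataclass

@dataclass
class Edit:
    summary: str
    is_anonymous: bool
    comment_is_blank: bool
    chars_removed: int
    country_risk: float  # stand-in for per-country vandalism rates

# Hypothetical weights; a real tool learns these from labelled reverts.
WEIGHTS = {"anon": 2.0, "blank": 1.0, "removed": 0.002, "country": 1.5}

def suspicion(edit: Edit, use_anon_dimension: bool = True) -> float:
    """Score an edit for the review queue; higher scores are inspected first."""
    score = (WEIGHTS["blank"] * edit.comment_is_blank
             + WEIGHTS["removed"] * edit.chars_removed
             + WEIGHTS["country"] * edit.country_risk)
    if use_anon_dimension:  # de Laat's proposed ban: set this to False
        score += WEIGHTS["anon"] * edit.is_anonymous
    return score

queue = sorted(
    [Edit("blanked a section", True, True, 800, 0.7),
     Edit("fixed a typo", False, False, 3, 0.1)],
    key=suspicion, reverse=True)
print([e.summary for e in queue])
</syntaxhighlight>
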


Regarding complication (2), whether some of the features used for vandalism profiling are socially sensitive, de Laat highlights that they include some amount of discrimination by nationality: IP edits geolocated to the US, Canada, and Australia have been found to contain vandalism more frequently and are thus more likely to be singled out for inspection. However, he does not consider this concern "strong enough to warrant banning the country-dimension and correspondingly sacrifice some profiling efficacy", chiefly because there do not appear to be a lot of nationalistic tensions within the English Wikipedia community that could be stirred up by this.


In contrast, de Laat argues that "the targeting of contributors who choose to remain ''anonymous'' ... is fraught with danger since anons already constitute a controversial group within the Wikipedian community." Still, he acknowledges the "undisputed fact" that the ratio of vandalism is much higher among anonymous edits. Also, he rejects the concern that they might be more likely to be the victim of false positives:
{{Signpost quote|normally [IP editors] do not experience any harm when their edits are selected and inspected as a result of anon-powered profiling; they will not even notice that they were surveilled since no digital traces remain of the patrolling. ... The only imaginable harm is that patrollers become over focussed on anons and indulge in what I called above 'overinspection' of such edits and wrongly classify them as vandalism ... As a consequence, they might never contribute to Wikipedia again. ... Nevertheless, I estimate this harm to be small. At any rate, the harm involved would seem to be small in comparison with the harassment of racial profiling—let alone that an 'expressive harm hypothesis' applies.}}


With this said, de Laat still makes the controversial call "that the anonymous-dimension should be banned from all profiling efforts" – including removing it from the scoring algorithms of Huggle, STiki and ClueBotNG. Instead of concerns about individual harm,
:''"my main argument for the ban is a decidedly moral one. From the very beginning the Wikipedian community has operated on the basis of a ‘social contract’ that makes no distinction between anons and non-anons—all are citizens of equal stature. [...] In sum, the express profiling of anons turns the anonymity dimension from an access condition into a social distinction; the Wikipedian community should refrain from institutionalizing such a line of division. Notice that I argue, in effect, that the Wikipedian community has only two choices: either accept anons as full citizens or not; but there is no morally defensible social contract in between."''
:''"my main argument for the ban is a decidedly moral one. From the very beginning the Wikipedian community has operated on the basis of a 'social contract' that makes no distinction between anons and non-anons—all are citizens of equal stature. ... In sum, the express profiling of anons turns the anonymity dimension from an access condition into a social distinction; the Wikipedian community should refrain from institutionalizing such a line of division. Notice that I argue, in effect, that the Wikipedian community has only two choices: either accept anons as full citizens or not; but there is no morally defensible social contract in between."''
While the paper is otherwise rich in citations and details, it completely fails to provide evidence for the existence of this alleged contract. While it is true that "the ability of almost anyone to edit (most) articles without registration" forms part of Wikipedia's [[m:founding principles|founding principles]] (a principle that this reviewer strongly agrees with), the "equal stature" part seems to be de Laat's own invention: there is a [[Wikipedia:IPs_are_human_too#What_an_unregistered_user_can.27t_do_by_themselves_.28directly.29|long list]] of things that, by longstanding community consensus, require the use of an account (which, after all, is freely available to everyone, without even requiring an email address). Most of these restrictions – say, the inability to create new articles, or being prevented from participating in project governance during admin or arbcom votes – seem much more serious than the vandalism profiling that is the topic of de Laat's paper.


===Briefly===
===Other recent publications===
''A list of other recent publications that could not be covered in time for this issue—[[m:Research:Newsletter#How to contribute|contributions are always welcome]] for reviewing or summarizing newly published research.''
*'''"Large SMT Data-sets Extracted from Wikipedia"'''<ref>{{Cite conference| isbn = 978-2-9517408-8-4| conference = TUFI 14.103| last1 = Tufiş| first1 = Dan| last2 = Ion| first2 = Radu| last3 = Dumitrescu| first3 = Ştefan| last4 = Ştefănescu2| first4 = Dan| title = Large SMT Data-sets Extracted from Wikipedia| booktitle = Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)| date = 2014-05-26| url = http://www.lrec-conf.org/proceedings/lrec2014/pdf/103_Paper.pdf}}</ref> From the abstract: "The article presents experiments on mining Wikipedia for extracting SMT [ [[statistical machine translation]] ] useful sentence pairs in three language pairs. [...] The optimized SMT systems were evaluated on unseen test-sets also extracted from Wikipedia. As one of the main goals of our work was to help Wikipedia contributors to translate (with as little post editing as possible) new articles from major languages into less resourced languages and vice-versa, we call this type of translation experiments 'in-genre' translation. As in the case of 'in-domain' translation, our evaluations showed that using only 'in-genre' training data for translating same genre new texts is better than mixing the training data with 'out-of-genre' (even) parallel texts."
*'''"Large SMT Data-sets Extracted from Wikipedia"'''<ref>{{Cite conference| isbn = 978-2-9517408-8-4| conference = TUFI 14.103| last1 = Tufiş| first1 = Dan| last2 = Ion| first2 = Radu| last3 = Dumitrescu| first3 = Ştefan| last4 = Ştefănescu2| first4 = Dan| title = Large SMT Data-sets Extracted from Wikipedia| booktitle = Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)| date = 26 May 2014| url = http://www.lrec-conf.org/proceedings/lrec2014/pdf/103_Paper.pdf}}</ref> From the abstract: "The article presents experiments on mining Wikipedia for extracting SMT [ [[statistical machine translation]] ] useful sentence pairs in three language pairs. [...] The optimized SMT systems were evaluated on unseen test-sets also extracted from Wikipedia. As one of the main goals of our work was to help Wikipedia contributors to translate (with as little post editing as possible) new articles from major languages into less resourced languages and vice-versa, we call this type of translation experiments 'in-genre' translation. As in the case of 'in-domain' translation, our evaluations showed that using only 'in-genre' training data for translating same genre new texts is better than mixing the training data with 'out-of-genre' (even) parallel texts."
*'''"Recognizing Biographical Sections in Wikipedia"'''<ref>{{Cite conference| pages = 811-816| last1 = Aprosio| first1 = Alessio Palmero| last2 = Tonelli| first2 = Sara| title = Recognizing Biographical Sections in Wikipedia| booktitle = Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing| location = Lisbon, Portugal| date = 2015-09-17| url = http://www.aclweb.org/anthology/D15-1095}}</ref> From the abstract: "Thanks to its coverage and its availability in machine-readable format, [Wikipedia] has become a primary resource for large scale research in historical and cultural studies. In this work, we focus on the subset of pages describing persons, and we investigate the task of recognizing biographical sections from them: given a person’s page, we identify the list of sections where information about her/his life is present [as opposed to nonbiographical sections, e.g. 'Early Life' but not 'Legacy' or 'Selected writings']."
*'''"Recognizing Biographical Sections in Wikipedia"'''<ref>{{Cite conference| pages = 811–816| last1 = Aprosio| first1 = Alessio Palmero| last2 = Tonelli| first2 = Sara| title = Recognizing Biographical Sections in Wikipedia| booktitle = Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing| location = Lisbon, Portugal| date = 17 September 2015| url = http://www.aclweb.org/anthology/D15-1095}}</ref> From the abstract: "Thanks to its coverage and its availability in machine-readable format, [Wikipedia] has become a primary resource for large scale research in historical and cultural studies. In this work, we focus on the subset of pages describing persons, and we investigate the task of recognizing biographical sections from them: given a person’s page, we identify the list of sections where information about her/his life is present [as opposed to nonbiographical sections, e.g. 'Early Life' but not 'Legacy' or 'Selected writings']."
*'''"'A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce': Learning State Changing Verbs from Wikipedia Revision History."''' <ref>{{Cite conference| conference = Proceedings of EMNLP 2015.| pages = 518-523| last1 = Nakashole| first1 = Ndapa| last2 = Mitchell| first2 = Tom| last3 = Wijaya| first3 = Derry| title = "A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce": Learning State Changing Verbs from Wikipedia Revision History.| location = Lisbon, Portugal| date = 2015| url = http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP059.pdf}}</ref> From the abstract: "We propose to learn state changing verbs [such as 'born', 'died', 'elected', 'married'] from Wikipedia edit history. When a state-changing event, such as a marriage or death, happens to an entity, the infobox on the entity’s Wikipedia page usually gets updated. At the same time, the article text may be updated with verbs either being added or deleted to reflect the changes made to the infobox. [...] We observe in our experiments that when state-changing verbs are added or deleted from an entity’s Wikipedia page text, we can predict the entity’s infobox updates with 88% precision and 76% recall." <small>[[User:Tbayer (WMF)|TB]]</small>
*'''"'A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce': Learning State Changing Verbs from Wikipedia Revision History."'''<ref>{{Cite conference| conference = Proceedings of EMNLP 2015.| pages = 518–523| last1 = Nakashole| first1 = Ndapa| last2 = Mitchell| first2 = Tom| last3 = Wijaya| first3 = Derry| title = "A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce": Learning State Changing Verbs from Wikipedia Revision History.| location = Lisbon, Portugal| date = 2015| url = http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP059.pdf}}</ref> From the abstract: "We propose to learn state changing verbs [such as 'born', 'died', 'elected', 'married'] from Wikipedia edit history. When a state-changing event, such as a marriage or death, happens to an entity, the infobox on the entity's Wikipedia page usually gets updated. At the same time, the article text may be updated with verbs either being added or deleted to reflect the changes made to the infobox. [...] We observe in our experiments that when state-changing verbs are added or deleted from an entity's Wikipedia page text, we can predict the entity's infobox updates with 88% precision and 76% recall." <small>[[User:Tbayer (WMF)|TB]]</small>
*'''"Extracting Representative Phrases from Wikipedia Article Sections"'''<ref>Shan Liu, Mizuho Iwaihara:
*'''"Extracting Representative Phrases from Wikipedia Article Sections"'''<ref>Shan Liu, Mizuho Iwaihara:
Extracting Representative Phrases from Wikipedia Article Sections, DEIM Forum 2016 C3-6. http://db-event.jpn.org/deim2016/papers/314.pdf</ref> From the abstract: "Since [Wikipedia's] long articles are taking time to read, as well as section titles are sometimes too short to capture comprehensive summarization, we aim at extracting informative phrases that readers can refer to."
Extracting Representative Phrases from Wikipedia Article Sections, DEIM Forum 2016 C3-6. http://db-event.jpn.org/deim2016/papers/314.pdf</ref> From the abstract: "Since [Wikipedia's] long articles are taking time to read, as well as section titles are sometimes too short to capture comprehensive summarization, we aim at extracting informative phrases that readers can refer to."


Recent research

AI-generated articles and research ethics; anonymous edits and vandalism fighting ethics

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.