Wikipedia:Wikipedia Signpost/2015-02-25/Recent research
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
First Women, Second Sex: Gender Bias in Wikipedia
- by Maximilianklein (talk)
The problem of the gender gap in Wikipedia can mean several things: a gap in editors, a gap in content, and of course the relation between the two. "First Women, Second Sex: Gender Bias in Wikipedia"[1] addresses the gap from the content side, with justification drawn from many Simone de Beauvoir quotes. The authors use an ensemble of three methods - DBpedia metadata, language modelling, and network theory - to show not just inequality in encyclopedia inclusion, but degrees of sexism in how biographies are included. For instance, how different genders meet notability is quantifiably different, as is the centrality of biographies in the link structure.
The initial metadata technique is an inspection of DBpedia data mashed up with a separate dataset from previous research based on pronoun-counting techniques. This method is a bit shaky, as it relies on the combination of two derived datasets, especially in an era when Wikidata can deliver data closer to the source. Nevertheless, the researchers find that 15.5% of the biographies in their final dataset are of women, which corroborates the 16.1% and 15.6% estimates obtained by alternative techniques[2]. Digging further, biographies are separated by subclass: athletes, politicians, military personnel, and all others are more heavily male - only artists and royalty are female-biased. Another finding from this type of infobox scraping is that female biographies are much more likely to have the spouse parameter filled.
Moving into the natural language realm, the paper inspects bigrams of the biographies' text. The top words associated with men are "played", "football" and "league"; for women, the top are "actress", "women's" and "her husband". This already starts to hint at the notion that men are notable for what they do, rather than only their static characteristics. To investigate further, Linguistic Inquiry and Word Count (LIWC) and two measures - frequency and burstiness - are employed for semantic classification. The semantic category where male biographies score significantly higher is cognitive processes, which encompasses words like "became", "known", and "made"; meanwhile, female biographies have significantly more sexual words like "love", "passion", and "sex".
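To make the bigram step concrete, here is a minimal sketch in Python of counting in how many biographies each bigram appears. It is only an illustration under stated assumptions: the tokenizer, the function names (`bigrams`, `bigram_document_frequency`) and the toy sentences are hypothetical, and the paper's own pipeline (including its LIWC and burstiness measures) is not reproduced here.

```python
import re
from collections import Counter


def bigrams(text):
    """Split text into lowercase word tokens and return adjacent pairs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return list(zip(tokens, tokens[1:]))


def bigram_document_frequency(texts):
    """Count, for each bigram, the number of texts it occurs in.

    Each bigram is counted at most once per text, so one long biography
    cannot dominate the totals (an assumption of this sketch, not
    necessarily the weighting used in the paper).
    """
    counts = Counter()
    for text in texts:
        counts.update(set(bigrams(text)))
    return counts


if __name__ == "__main__":
    # Toy stand-ins for biography text; not real data.
    toy_biographies = [
        "She and her husband moved.",
        "Her husband was a painter.",
        "He played football.",
    ]
    for pair, n in bigram_document_frequency(toy_biographies).most_common(3):
        print(" ".join(pair), n)
```

Running the same count separately over male and female biographies and comparing the resulting rankings is one simple way to surface terms such as "her husband".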
The last domain explored is network structure. Each biography links to, and is linked from, other biographies, and the biographies thus form a directed graph. The first interesting thing to note is that in chi-squared testing between four link types (female-female, female-male, male-male, male-female), only female-female links occur more often than expected. Next, a PageRank ranking is computed on the graph, which determines the importance or "centrality" of biographies. It is found that any subset formed by removing the lowest-PageRanked articles has a female ratio below the overall figure.
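The PageRank comparison can be sketched as follows. This is a toy illustration, not the authors' code: the four-node graph, the gender labels, and the helper names (`pagerank`, `female_share_of_top`) are assumptions made for the example, and PageRank is implemented as a plain power iteration with the commonly used damping factor of 0.85.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed graph.

    `links` maps each page to the list of pages it links to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every page receives the same baseline "teleport" share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


def female_share_of_top(rank, gender, k):
    """Fraction of the k highest-ranked pages labelled "F"."""
    top = sorted(rank, key=rank.get, reverse=True)[:k]
    return sum(gender[v] == "F" for v in top) / len(top)


if __name__ == "__main__":
    # Hypothetical biographies: A and B are male, C and D female.
    links = {"A": ["B"], "B": ["A"], "C": ["A", "D"], "D": ["A", "C"]}
    gender = {"A": "M", "B": "M", "C": "F", "D": "F"}
    rank = pagerank(links)
    for k in (4, 2):
        print(k, female_share_of_top(rank, gender, k))
```

In this toy graph the female share is one half over all four pages but drops to zero among the two most central ones, which mirrors in miniature the pattern the review describes.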
The authors wrap up their conclusions within the context of feminist theory. They argue that the notion of gender roles is evident in Wikipedia, in that the metadata shows men are more often known as sportspeople, and women as artists, royalty or spouses of someone else. Likewise, the language of biographies is biased. That "her husband" and "first woman" are top terms in female articles means that these articles fail the Finkbeiner test. Furthermore, the authors claim this exhibits "objectification", in light of the evidence that the "cognitive processes" of men were shown to be more significant than those of women, and that the "sexual" category is the only one in which women are more frequently described than men. Finally, as viewed from the network structure results, female biographies are less central to the encyclopedia. This is said to be because of historical philosophy and today's notability guidelines, in which "reason and objectivity are gendered male" - a feminist metaphysical view. The authors imagine that the tendency of female articles to link to other female articles more than expected is due to women-led efforts to address the gender gap.
Overall, this article provides a wide variety of methods to measure the gender gap, which proves a high-level point from many perspectives. It is situated in feminist thought, but multiple returns to Beauvoir make the final analysis seem superficial and general. Additionally, the simplifying assumptions of English-only and derived datasets leave open the criticism that the larger points cannot be disentangled from a few extra biases introduced by language- and processing-inherited lenses. The authors admit as much in their limitations, where they also acknowledge not questioning the gender binary. What we have here, though, is an increment to a growing pile of methods and techniques proving the gender gap, which, ideologically, does not need, but can always benefit from, additional statistical legitimacy.
Wikipedia’s SOPA Strike considered as international political movement
A paper[3] written by prolific Wikipedian Piotr Konieczny revisits the SOPA Strike. This was a 24-hour blackout of the English Wikipedia in 2012 to protest against proposed American copyright legislation, accompanied by tools for citizens to contact their representatives on the issue. The author argues this event demonstrates a new political opportunity structure for international movements, such as the free culture movement, to influence national policies.
A chronology of the events leading up to the SOPA Strike on Wikipedia is presented. The author then analyzes Wikipedia’s forums debating whether and how to restrict access to the site for a day. Debate participants are classified by such characteristics as national origin, history of editing Wikipedia, and stated arguments for and against, and simple quantitative analyses of population percentages and relative contribution are performed. Konieczny then tests various hypotheses about the nature of the protest, to see which one fits the facts.
Konieczny concludes that experienced Wikipedians were generally supportive of a protest but were more likely to express misgivings about losing neutrality. Americans also participated in a greater proportion than their prevalence on the English Wikipedia. However the process also allowed non-US citizens and free culture idealists to have significant leverage over the debate on Wikipedia, and thus on American national politics. Konieczny tries to show that Wikipedia is thus an international social movement in the broader free culture movement. Konieczny ends the paper with a speculation that the many pro-blackout single-purpose accounts may reflect a new political consciousness among the young and internet-savvy.
Konieczny's analysis gives us a very detailed, fascinating picture of what arguments were made in public on Wikipedia forums during a crucial few weeks. However, this may omit some of the most influential discussions, by insiders, taking place person-to-person and in chat rooms. The paper also omits discussion of the influence of the Wikimedia Foundation, as an American institution responding to an American legal threat.
When Konieczny asserts the existence of a rising transnational "Net Generation", he presents very little evidence. A skeptical or quietist Wikipedian might still conclude that the encyclopedia wasn't acting as an organ of democracy, but was briefly overrun by a Twitter trending topic. If Konieczny is right, we may see other internet-based communities also being pressed into service, or more permanent institutions being developed to serve this new community.
Full disclosure: I (NeilK) was intimately involved with the SOPA Strike movement on Wikipedia, as a technologist on the WMF staff, and as a concerned Wikipedian who weighed in on the very forums analyzed in this paper, in favor of a blackout.
References
- ^ Graells-Garrido, Eduardo (2015-02-08). "First Women, Second Sex: Gender Bias in Wikipedia". arXiv:1502.02341 [cs].
- ^ "Wiki Research Mailing List Discussion".
- ^ Konieczny, Piotr (2014-09-29). "The day Wikipedia stood still: Wikipedia's editors' participation in the 2012 anti-SOPA protests as a case study of online organization empowering international and national political opportunity structures". Current Sociology. I (23): 77–93. doi:10.1177/0011392114551649. Retrieved 2015-02-25.
Discuss this story
Great article! I especially enjoyed reading the one about the university students :) -Newyorkadam (talk) 04:50, 27 February 2015 (UTC)
I doubt that the aforementioned Finkbeiner test is an objective test to be reckoned with. Considering that the science is still dominated by men, indicating that a particular scientist is a woman is important. Interestingly enough, the test was conceived by a woman who apparently is not happy to be called a woman and proposes to erase such mentions from historical annals, which is puzzling. Brandmeister (talk) 09:56, 27 February 2015 (UTC)
Loved the students' (correct) response too.
The geo-data by Oliver Keyes mentioned is not accessible at the moment, seems the account has used up its data allocation. It would be nice to be able to see the stuff somewhere else? Chiswick Chap (talk) 12:45, 27 February 2015 (UTC)
1. I would love to see computer algorithm values for article quality! We already have a readability of Wikipedia link, but this sounds more like a utility value? The project "grades" seem rarely changed, and too broad to be of much use.
2. The "student vandalism project" was always going to fail - they tried making really blatant edits. The problem is more generally with "partial wrongful edits" where the edit does not instantly jump out on a watchlist as being horrid -- but more subtle in tone. The teacher should have simply told them to modify refs to make them lead to completely different topics -- which would be a far better test of how we actually find problematic editors IMHO. Ref changes do not get caught as well as they ought. Collect (talk) 13:51, 27 February 2015 (UTC)
Clubot passed the Turing test. :) Oiyarbepsy (talk) 15:52, 27 February 2015 (UTC)