"You wouldn't recognise a fact if it bit you in the ass"; "eat your 'fucking' crow"; "[you are] an ignorant idiot"; "If you get testicle cancer or become a transsexual, then estrogen ... could enlarge and improve the mammary function of your breasts."; "are you a pedophile?"
"I'm sorry if that's considered a personal attack, but it's just true."
In the impersonal, detached Colosseum that is Wikipedia, people find it much easier to put their thumbs down. As such, many people active in the Wikimedia movement have witnessed a precipitous decline in civil discourse. This is far from a new trend, yet many people would agree that it all seemed somehow worse in 2012.
On the English Wikipedia, this is most often witnessed on the administrators' noticeboards, but the decline was perhaps most visible in the featured article process, where the various talk pages were disrupted with personal disputes, sockpuppetry, and gladiatorial nastiness. These attitudes have been increasingly evident in many corners of our encyclopedia.
Some people have talked of a new-year détente between the warring parties. While this could result in greatly reduced tension—assuming everyone involved agreed, which they have not—new disputes arise every day; détente alone will not solve the problem. Yet there is still resistance from editors: for example, there is a certain attitude that the quality of Wikipedia is low, and editors need to be kicked into improving it with harsh language.
Those attitudes should be rejected. Have we not tried that for the last several years? Do any editors believe it has worked? We have to come together and improve the health of our community. Even Rome eventually found that gladiatorial fights were detrimental to its society.
Wikipedia is what we, the community, make of it.
Take it upon yourself in this new year to make it a better place.
The Signpost's volunteers wish all of our readers a Happy New Year. We hope 2013 brings everything you wish of it.
Brion Vibber has been a Wikipedia editor for nearly 11 years and was the first person officially hired to work for the Wikimedia Foundation. He was instrumental in early development of the MediaWiki software and is now the lead software architect for the foundation's mobile development team.
Brion Vibber, 2008
Avar and Brion at Wikimedia Conference Berlin 2009
Brion speaks at Wikimania 2011
Brion, how did you end up becoming the Wikimedia Foundation's first employee?
I'd been working on the Wikipedia software – what eventually became MediaWiki – since slightly before the first deployment around late 2001 and early 2002. Though I started with localization and Unicode support, I expanded until I was the primary maintainer by the time the name MediaWiki rolled around. :)
For a couple years this was just volunteer development while I was in film school. Eventually, around the end of 2004, Jimmy Wales approached me about contracting half-time to make sure I could continue to work on wiki stuff. At that stage, I was working an on-campus job – on the Southern California Earthquake Center's Electronic Encyclopedia of Earthquakes – but not really taking classes any more, since I was mostly doing web development.
Back then, Wikimedia's entire annual budget was minuscule and I was worried about my pay breaking the bank – Jimmy assured me that part of what I'd be working on was content-feed support for some third-party indexers, which would directly pay for my wages; so I accepted. Half-time quickly became full-time, and I left the earthquake project for Wikimedia.
I think I still have a few credits left to complete before I can get that film degree, but it wouldn't help me much in the software development world I've ended up in. :)
What was the first year like and how did you manage to juggle all of the tasks that had to be taken care of?
The first year of paid work (2005–2006ish) involved a lot of running around and poking at everything that moved. We had very few servers and just a few volunteer server administrators.
What do you miss most about the old days?
Things were a lot more rough-and-tumble, but they were also sometimes quicker. We could bang out a cool feature and deploy it immediately ... Now, we're working at a relatively fast-paced release schedule on the mobile interface, which is one of the things I like about working on those projects.
As a lot of people know, you're somewhat of a linguaphile. Did this have a significant influence on the development of MediaWiki's multi-lingual support?
I'm unfortunately not very fluent in many languages, but I've always loved learning about languages. I took some German and Latin in high school, French in university, and then for fun taught myself Esperanto – a constructed language with over a century of history – and a little Japanese.
It was actually through Esperanto that I discovered Wikipedia – other posters on an Esperanto-language newsgroup pointed out the Esperanto Wikipedia, looking for volunteers to help build it.
Back then Wikipedia ran on the UseModWiki engine, a Perl-based program which had very limited multilingual support – and at the time no Unicode support at all.
I noticed that for Esperanto – which has a few optional "funny characters" – we were unable to use the proper accents and had to make do with an ASCII notation where, for instance, "sx" stood for "ŝ". So my first contribution was to make it so the characters were automatically converted for proper display, even though we still used the X-notation for editing.
Next was full Unicode (UTF-8) encoding support. UseModWiki's limitation to the Latin-1 8-bit character set made languages like Russian, Hebrew, and Japanese completely unusable – we had wikis for them, but nobody could actually write text on them. Polish Wikipedians had gone as far as to set up their own offsite wiki, reconfigured for Latin-2, so they could work until something got fixed. Once we rolled out Magnus Manske's PHP-based software (known later as "phase 2"), I added support for UTF-8 so we could switch those languages over and actually type in them.
Phase 2 was an awesome proof of concept but had major performance problems once it was deployed under real load, so it soon got rewritten by Lee Daniel Crocker into "phase 3", which developed into the MediaWiki we know today. Unfortunately Lee had forgotten to adapt my Unicode and localization work, so I had to write it (as well as the Esperanto X-translation) again for phase 3. ;)
A lot of the more recent localization work has been done by other people, but I jump in from time to time to help; some of the familiar old problems with missing fonts and keymaps – especially for Indic languages – remind me a lot of the work I did back then!
A lot of your work recently has focused on better mobile support for MediaWiki. Can you tell us a bit about what you're up to on the mobile front?
We're working on three main areas in mobile:
making the website awesome in mobile browsers both for readers and contributors
providing a great app for popular smartphone and tablet devices, with additional offline capabilities
building action-driven apps for things like Commons photo uploads.
We've got a pretty good reader experience on mobile browsers now, and we're starting to work on more contributor-focused features. A basic mobile-friendly watchlist view is in early beta testing, and we've developed some experimental editing and photo upload features.
I strongly recommend an interview with Jon Robson sometime for more details on the mobile web – he's gone so far as to "dogfood" the mobile site in his desktop browser, hence adding basic editing just so he could use the mobile site all the time.
In the mobile space it's also important to have a presence in app stores, both for discoverability and because there are still things that a web page can't do by itself. We're just starting to get camera access in the browser, but offline storage, background notifications, and hooking into inter-application frameworks like the "share" feature on Android still require us to develop apps.
We've been using HTML5 technologies extended by a PhoneGap/Cordova container, but are starting to look at more "native" app development – using Objective-C for iOS and Java for Android – for better performance, easier system integration, and better compatibility with older versions of Android out in the wild.
Yuvi Panda has been doing a lot of the initial Android native work; he's worked on an app for reading the Signpost – with notifications for new issues – and a Commons uploader app that allows you to trigger it directly from the camera or gallery app.
What other interesting projects have you worked on lately?
A couple of my pet projects from last year are getting closer to fruition. The CodeEditor extension, which embeds a syntax-highlighting editor for JS and CSS pages, has been adapted for editing the new Lua templating engine. I'm also working on updating the SVGEdit extension, which embeds a web-based vector graphics editor, allowing you to edit SVG graphics directly on-wiki. There's been a lot of new upstream activity on the editor thanks to a push from TikiWiki's Marc Laporte and others, so I'm getting excited about it again. The first thing I have to do is make a test suite to ensure that it won't damage existing files when trying to edit them.
Finally, I've been socializing the notion of MP4 (H.264/AAC) video and audio transcoding output, which will allow us to serve video and audio to Macs, Windows 7/8 PCs, iOS, and other desktop and mobile platforms without additional local software installation. There are some ideological issues with even partial support of a patent-encumbered format, but we'll see how it goes – our goal is to get information out to people, after all.
What are your thoughts about the new Wikidata project? Do you think you'll have an interest in contributing to it?
I'm super-glad it's happening; I'm also glad I don't have to worry about it because a bunch of smart folks are working on it.
What other projects besides English Wikipedia have you contributed to?
In the early days I contributed a lot of edits to the Esperanto Wikipedia as well, but I've got behind on that project. These days I mostly poke at documentation on MediaWiki.org, and do the occasional copyedit or fix when I notice something is awry.
If you had a robot copy of yourself, what would you most want it to work on?
Modernizing the desktop interface to be as awesome as we're starting to make the mobile one ... there are lots of great projects going on including the VisualEditor and the new Echo notifications and Flow notification/discussion/misc interface, but I just don't have time to work on them all.
What do you see as the biggest challenge facing the Wikimedia Foundation?
Figuring out how to balance "creation" and "maintenance" modes in our community, our user interfaces, and the projects we spend money or do outreach on. On the one hand we're trying to get more people to edit, on the other hand we want stable, quality materials. We tend to scare away newbies with the second hand.
I'd also love to see an endowment fund to give us more secure long-term income. I'm so glad that we're ad-free, but the donor-drive model is hard to predict and I worry about sustainability as core projects become more "complete" in major languages.
Wikimedia Foundation concludes fundraising campaign
Brightly colored banners detail the role of the Wikimedia Foundation in maintaining Wikipedia
On 27 December the Wikimedia Foundation (WMF) announced the conclusion of their ninth annual fundraiser, which attracted more than 1.2 million donors. The appeal reached its goal of US$25 million, even though fundraising banners ran for only nine days.
This year's campaign emphasized facts about the WMF, featuring lines of text in brightly colored banners. Previous campaigns used personal appeals from Wikipedia editors and co-founder Jimmy Wales. While the WMF considers the fundraiser successful, some editors expressed concerns about the invasive banners and what they considered the low amount of cash raised. The remainder of the WMF's funding will come from grants and donations given outside the annual campaign.
The WMF's Executive Director Sue Gardner said, "I'm grateful that the Wikipedia fundraiser was so successful. Our supporters are wonderful and without them we could not do the job of delivering free content worldwide. We're thrilled to be able to introduce our readers to the editors around the world who create Wikipedia and to invite our readers to join in editing." A thank-you banner campaign began running on Wikipedia this week for viewers in the US, Canada, Britain, Australia and New Zealand—the five countries targeted in this campaign.
Czech Parliament releases photos of senators to chapter
The Parliament of the Czech Republic has released a total of 23 images under free licenses to the local chapter, Wikimedia Czech Republic, marking the beginning of a significant relationship. The images depict newly elected senators in the Czech parliament and will be used in biographical articles on Wikipedia projects. The images are now available on the Wikimedia Commons in this category under the Creative Commons Attribution 3.0 license.
A representative from Wikimedia Czech Republic contacted the public relations department at the Czech Parliament. Their letter detailed the copyright requirements and impact of Wikipedia. The office could not release photographs of all 81 senators due to the copyright requirements of older images, but the chapter was assured that future photographs would be released for use in Wikipedia. Wikimedia Czech Republic announced the accomplishment with a blog post in both Czech and English on December 28.
Personal data of editors and a bot subpoenaed: Neumont University is seeking the personal information of all registered editors who made changes to the article on the university, including at least one bot, a possible first in a subpoena. The information was subpoenaed for use in a court case against a third party. However, except for users with advanced permissions, the foundation does not possess the addresses, full names, and other personal information of Wikipedia editors.
Affiliations Committee seeks volunteer candidates: The Wikimedia Affiliations Committee has published a call for 2013 candidates and is looking for about six members. The committee is responsible for guiding volunteers in establishing chapters, user groups, and thematic organizations, and for approving them.
Commons plans 2012 Picture of the Year contest: The Wikimedia Commons is planning its seventh annual Picture of the Year contest and has announced that Round 1 will begin on January 16, 2013. The planning committee is seeking translators.
Milestones: Wikidata, the centralized hub for organizing data on Wikimedia projects, reached two million entries this week. At this time, the entries list only the versions of Wikipedia articles in different languages.
English Wikipedia notes:
Quarterly update: The quarterly update, compiling the past three months of modifications to content policy pages, is complete.
Wikipedia's 12th birthday: Wikipedia will turn 12 on 15 January 2013 after being founded by Jimmy Wales and Larry Sanger on that date in 2001.
The Register swings at the Wikimedia movement's finances, and misses
It's that time of year again. As the Christmas lights go up, Wikipedia's donation drive kicks off. Wikipedia claims that the donations are needed to keep the site online. Guilt-tripped journalists including Heather Brooke and Toby Young have contributed to Wikipedia in the belief that donations help fund operating costs. Students, who are already heavily in debt, are urged to donate in case Wikipedia "disappears".
But what Wikipedia doesn't tell us is that it is awash with cash—and raises far more money each year than it needs to keep operating.
Andrew Orlowski, the author, opened the piece strangely, bringing up an unrelated story from 2005 and several early missteps by the Wikimedia Foundation.
He paraded several examples to support his argument. He first targeted one of the Wikimedia chapters: "In the UK, the local chapter of WMF, Wikimedia Foundation UK [sic], admitted to racking up a bill of £1,335 on business cards, calling it 'a failure to make the most effective procurement choices'." Yet in this claim he confused chapters, which are independent, with the WMF. Worse, he wrongly attributed the quote to the chapter (it was a question from WMUK trustee Fæ). The actual chapter response states, "We do not believe this represents a failure to make an effective procurement choice, as alternative suppliers were sought, and a sensible decision was reached ... [but in the] future, we will ensure that business card purchases are more thoroughly researched."
Orlowski next questioned Wikimedia Germany's €18,000 funding for editors to attend and photograph concerts, along with €81,000 to photograph many politicians. This of course fails to note that freely licensed, professional photographs of government figures are rare outside the United States, whose federal government releases its photography into the public domain.
Last, Orlowski conflated Omidyar Network's $2 million donation (2009) with winning a seat on the WMF Board of Trustees. The trustee in question, Matt Halprin, was appointed on 24 August 2009, just one day before Omidyar's donation. However, Halprin has since left Omidyar and continued to serve as a trustee until last month, and there is no evidence of a 'donation for board seat' agreement.
This superficial journalism took the place of what could have been more valid and useful criticism of the movement, for which there are many opportunities, such as the repeated delays in the development of the visual editor, which is viewed as essential for the continued health of Wikimedia projects. WMUK, too, has many areas that could be examined, such as the Gibraltar controversy (see Signpost coverage: "UK chapter rocked by Gibraltar scandal"). By bringing up only three examples, one from three years ago, another misquoted and extremely minor, and the third necessary to obtain high-quality photographs to headline Wikipedia articles, Orlowski missed a chance to offer real, constructive criticism of the WMF and its chapters.
Links alleged between Jimmy Wales and Kazakh government
The Telegraph and Daily Dot, among others, have alleged that there are many links between the WMF, Wikipedia co-founder Jimmy Wales, and Kazakhstan's government, which is for all intents and purposes a one-party non-democratic state.
The controversy began when the background behind Wales' first "Wikipedian of the Year", Rauan Kenzhekhanuly (see Signpost coverage), was publicized. Before becoming the head of a non-profit organization, Wikibilim, he served in Kazakhstan's Russian embassy and as the Moscow Bureau chief for the National TV Agency, which is viewed as a Kazakh government propaganda outlet. Additionally, his organization is backed by Kazakhstan's sovereign oil wealth fund, which is run by Kazakh President Nursultan Nazarbayev's son-in-law.
Wales has vocally supported both Kenzhekhanuly and Wikibilim, and the WMF gave the latter a US$16,600 grant to hold a Wikimedia conference in Kazakhstan in April 2012.
Wales defended himself and the WMF, saying: "The Wikimedia Foundation has zero collaboration with the government of Kazakhstan. Wikibilim is a totally independent organization. And it is absolutely wrong to say that I am 'helping the Kazakh regime whitewash its image.' I am a firm and strong critic. At the same time, I'm excited by the work of volunteers, and I believe—very strongly—that an open and independent Wikipedia will be the death knell for tyranny in places like Kazakhstan."
Whatever Wales' culpability, there is an inherent problem in this awkward situation, as Eurasianet's Myles Smith points out:
As Kazakh-language development is a major policy goal of the Kazakh government, Kenzhekhanuly must know how much favor his project curries with the government, just as similar projects sponsored by USAID, OSF, and Chevron have. Whatever the intentions of Kenzhekhanuly's organization, or of Jimmy Wales' cheerleading, the reality is that an authoritarian system, particularly one as well financed as oil-rich Kazakhstan’s, has thus far choked the idealist dreams of the crowd-sourced openness revolution. Without the freedom to express opinions openly in all fora, the online medium may remain a reflection of discussions in the rest of society, not an exception to them.
Later comments on Wales' talk page, led by Andreas Kolbe, tried to forge a link between Wales and the Kazakh government through Wales' friendship with Tony Blair, the former prime minister of the United Kingdom and the head of a public relations firm that has been contracted by the Kazakh government in the past. Wales, saying that Kolbe's tenuous allegations were "weird and irrelevant", hatted the discussion and banned Kolbe from his talk page.
The Signpost mentioned the Kazakh Wikipedia developments in June 2011.
This article was retitled on 5 January, as the previous title was too strong and therefore did not accurately portray the position of the news coverage surrounding the event.
Top hundred articles, by views: Various media outlets, including the BBC and Infodocket, have run stories on the most-viewed articles on Wikimedia sites. Facebook was this year's most-viewed article on the English Wikipedia, but the BBC highlighted the differences in article views between languages. The full lists are available on the Toolserver. As the BBC acknowledged (but several other sources did not), a few of the results are highly suspect, but most seem to be accurate.
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, edited jointly with the Wikimedia Research Committee and republished as the Wikimedia Research Newsletter.
How Wikipedia deals with a mass shooting
Northeastern University researcher Brian Keegan analyzed the gathering of hundreds of Wikipedians to cover the Sandy Hook Elementary School shooting in the immediate aftermath of the tragedy. The findings are reported in a detailed blog post that was later republished by the Nieman Journalism Lab. Keegan observes that the Sandy Hook shooting article reached a length of 50Kb within 24 hours of its creation, making it the fastest-growing article by length in the first day among recent articles covering mass shootings on the English-language Wikipedia. The analysis compares the Sandy Hook page with six similar articles from a list of 43 articles on shooting sprees in the US since 2007. Among the analyses described in the study, of particular interest is the dynamics of dedicated vs occasional contributors as the article reaches maturity: while in the first few hours contributions are evenly distributed with a majority of single-edit editors, after hour 3 or 4 a number of dedicated editors show up and "begin to take a vested interest in the article, which is manifest in the rapid centralization of the article". A plot of inter-edit time also shows the sustained frequency of revisions that these articles display days after their creation, with Sandy Hook averaging about 1 edit/minute around 24 hours after its first revision. The notebook and social network data produced by the author for the analysis are available on his website. The Nieman Journalism Lab previously covered the role that Wikipedia is playing as a platform for collaborative journalism, and why its format outperforms Wikinews, in an interview with Andrew Lih published in 2010. The early revision history of the Sandy Hook shooting article was also covered in a blog post by Oxford Internet Institute fellow Taha Yasseri, though with a focus on the coverage in different Wikipedia language editions.
Network positions and contributions to online public goods: the case of the Chinese Wikipedia
In a forthcoming paper in the Journal of Management Information Systems (presented earlier at HICSS '12), Xiaoquan (Michael) Zhang and Chong (Alex) Wang use a natural experiment to demonstrate that changes to the position of individuals within the editor network of a wiki modify their editing behavior. The data for this study came from the Chinese Wikipedia. In October 2005, the Chinese government suddenly blocked access to the Chinese Wikipedia from mainland China, creating an unanticipated decline in the editor population. As a result, the remaining editors found themselves in a new network structure and, the authors claim, any changes in editor behavior that ensued are likely effects of this discontinuous "shock" to the network. The paper defines each editor as a node (vertex) in the network and a tie (edge) between two editors is created whenever the editors edit the same page in the wiki. They then examine how changes to three aspects of individual editors' relative connectedness (centrality) to other editors within the network altered their subsequent patterns of contribution.
The main finding is that changes in the three kinds of editors' connectedness within the network result in differential changes to their editing behavior. First, an increase in the number of direct connections between one editor and the rest of the network (degree centrality) resulted in fewer edits by that editor, and more work on articles they created. Second, an increase in the overall proximity of an editor to the other members of the network (closeness centrality) resulted in fewer edits and less work on articles they created. Third, an increase in the extent to which an editor connected otherwise isolated groups in the network (betweenness centrality) resulted in more edits and more work by that editor on articles they created. Overall, these results imply that alterations to the network structure of a wiki can change both the quantity and quality of editor contributions. The researchers argue that their findings confirm the predictions of both network game theory and role theory; and that future research should try to analyze the character of the network ties created within platforms for large-scale online collaboration, to better understand how changes to network structure may alter collaborative practices and public goods creation.
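The network construction and the three centrality measures described above can be illustrated with a minimal Python sketch. It is not the authors' code: the page and editor names are invented, the normalizations are common textbook choices rather than ones confirmed by the paper, and closeness is computed over reachable editors only.

```python
from collections import deque
from itertools import combinations


def build_coedit_graph(page_editors):
    """Tie two editors together whenever they edited the same page."""
    graph = {}
    for editors in page_editors.values():
        unique = sorted(set(editors))
        for editor in unique:
            graph.setdefault(editor, set())  # keep lone editors as nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other editors that each editor is directly tied to."""
    n = len(graph)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable editors divided by the sum of shortest-path distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):  # accumulate dependencies farthest-first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # each pair was counted twice


# Hypothetical data: which editors touched which page.
pages = {"Page A": ["ann", "bob"], "Page B": ["bob", "cy"], "Page C": ["cy", "dee"]}
graph = build_coedit_graph(pages)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
```

In this toy chain of four editors, "bob" and "cy" sit between the two ends, so they score highest on all three measures. In a real editor network the measures can diverge, which is what lets the study examine each one separately.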
Quality of pharmaceutical articles in the Spanish Wikipedia
In an online early version of an upcoming article in Atención Primaria, researchers at the Miguel Hernández University of Elche and the University of Alicante have benchmarked articles on pharmaceutical drugs in the Spanish Wikipedia against information available in a pharmaceutical database, Vademécum. A subset of the Vademécum corpus of 3,595 drugs was created using simple random sampling without replacement, consisting of 386 drugs. Of these, 171 (44%) had entries on the Spanish Wikipedia, which were then scrutinized along several dimensions in May 2012. Usage of the drug was correctly indicated in 155 (91%) of these articles, dosage in 26 (15%), and side-effects in 64 (37%), with only 15 articles (9%) scoring well in all of these dimensions. The researchers conclude that, while Wikipedia has a high potential to help with the dissemination of pharmaceutical knowledge, the Spanish-language edition does not currently live up to this potential. As a possible solution, they suggest the pharmaceutical community more actively participate in editing Wikipedia. The list of the drugs involved has not been made public, since a similar study is currently underway whose results may be distorted by targeted intervention. The authors have signalled to this research report their intention to make the list available after this second study is complete.
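The percentages quoted above can be re-derived from the reported counts. The short Python sketch below does so and also shows what simple random sampling without replacement looks like. The drug identifiers are placeholders (plain integers), and the seed is arbitrary, since the study's actual list has not been published.

```python
import random

CORPUS_SIZE = 3595   # drugs in the Vademécum corpus, as reported
SAMPLE_SIZE = 386    # drugs drawn for the study
WITH_ARTICLE = 171   # sampled drugs that had a Spanish Wikipedia entry

# Simple random sampling without replacement: no drug can be drawn twice.
# The integer IDs and the seed are placeholders, not the study's data.
sample = random.Random(0).sample(range(CORPUS_SIZE), SAMPLE_SIZE)


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


coverage = pct(WITH_ARTICLE, SAMPLE_SIZE)
correct = {"usage": 155, "dosage": 26, "side effects": 64, "all three": 15}
shares = {name: pct(count, WITH_ARTICLE) for name, count in correct.items()}

print(f"coverage: {coverage}%")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

Note that the accuracy percentages use the 171 existing articles as the denominator, not the 386 sampled drugs.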
Wikipedia editing patterns are consistent with a non-finite state model of computation
A paper posted to ArXiv by SFI's Omidyar fellow Simon DeDeo presents evidence for non-finite state computation in a human social system using data from Wikipedia edit histories. Finite-state systems are the basis for the study of formal languages in computer science and linguistics, and many real-world complex phenomena in biology and the social sciences are also studied empirically by assuming the existence of underlying finite-state processes, for the analysis of which powerful probabilistic methods have been devised. However, the question of whether the description of a system truly entails a finite or a non-finite, unbounded number of states is an open one. This is significant from a functionalist point of view: can we classify a system by its computational properties, and can these properties help us better understand how the system works regardless of its material details?
The paper's contribution lies in its proof of a probabilistic generalization of the pumping lemma, a device used in theoretical computer science as a necessary condition for a language to be described by only a finite number of states. The lemma is applied to the edit histories of a number of the most frequently edited articles in the English Wikipedia, after these histories have been transformed into coarse-grained sequences of "cooperative" or "non-cooperative" (reversion) edits (reverts being identified by means of their SHA1 field). A Bayesian argument is applied to show that the lemma cannot hold for a majority of sequences, thus showing that Wikipedia's collaborative editing system as a whole cannot be described by any aggregation of finite-state systems. The author discusses the implications of this finding for a more grounded study of Wikipedia's editing model, and for the identification of detailed computational models of other social and biological systems.
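To make the coarse-graining step concrete, here is a minimal Python sketch. It rests on an assumption: a revision is treated as a revert when its SHA1 digest matches that of any earlier revision of the same page. This is a common heuristic, and the paper's exact rule may differ. The revision texts are invented.

```python
import hashlib


def sha1_of(text):
    """SHA1 digest of a revision's text, standing in for the stored SHA1 field."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def coarse_grain(digests):
    """Map a page history to a string of 'C' (cooperative) and 'R' (revert).

    Assumption: a revision counts as a revert if its digest was already
    seen earlier in the same history, i.e. it restores a previous state.
    """
    seen = set()
    labels = []
    for digest in digests:
        labels.append("R" if digest in seen else "C")
        seen.add(digest)
    return "".join(labels)


# Invented history: the fourth revision restores the second one.
revisions = [
    "stub",
    "stub plus detail",
    "vandalised",
    "stub plus detail",
    "stub plus detail and sources",
]
sequence = coarse_grain(sha1_of(text) for text in revisions)
print(sequence)
```

Strings of this kind, one per article, are the sort of two-symbol sequences to which a test such as the probabilistic pumping lemma could then be applied.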
A full chapter is dedicated to the background on the concept of collective memory and its appearance in the digital world. The thesis continues with an analysis of "anniversary edits", showing a significant increase in editorial activities on articles related to traumatic events during the anniversary period compared to a large random sample of "other" articles. More detailed linguistic indicators are introduced in the next chapter. It is statistically shown that the terms related to affective processes, negative emotions, and cognitive and social processes occur more often in articles on traumatic events; "Specifically, the relative number of words expressing anxiety (e.g., “worried”), anger (e.g., “hate”) and sadness (e.g., “cry”) was significantly higher in articles about traumatic events".
In the next step, Ferron tried to distinguish between human-made and natural disasters. It has been observed that "human-made traumatic events were characterized by language referring to anger and anxiety, while the collective representation of natural disasters expressed more sadness". Finally, a detailed case study of the talk pages of articles on the 7 July 2005 London bombings and the 2011 Egyptian revolution was carried out, and language indicators, especially those related to emotions, were investigated in a dynamic framework and compared for both examples.
The English Wikipedia landing page, symbolically its only page during the blackout on January 18, 2012
The paper provides an interesting discussion of legitimacy in Wikipedia's governance, and discusses the legitimacy of the decision to participate in the protests. The author notes that the initiative was given a major boost by Jimmy Wales' charismatic authority: Wales posted a straw poll about the issue on his talk page on December 10, 2011, and while the issue had been discussed by the community beforehand (for example, in mid-November at the Village Pump), those discussions attracted much less attention. It is hard to say whether the protest would have happened without Jimbo's push for more discussion, as it veers towards "what if" territory; as things happened, it is true that Jimbo's actions began a landslide that led to the protests. However, this reviewer is more puzzled at the claim made in the introduction to the article that the discussion involved a "massive involvement of the Wikimedia Foundation staff". While several WMF staffers were active in the discussions in their official capacity, and while the WMF did issue some official statements about the ongoing discussion, the paper certainly does not provide any evidence to justify the word "massive".
The paper subsequently notes that the WMF focused on providing information and gently steering the discussion, without any coercion; this hardly justifies the claim of "massive involvement". At the very least, a clear explanation is necessary of precisely how many WMF staffers participated in the discussion before such a grandiose adjective as "massive" is used. It is true that the WMF staffers helped push the discussion forward, but this reviewer believes that the paper does not sufficiently justify the stress it puts on their participation, and thus may overestimate their influence.
The third part of the paper discusses how the arguments about legitimacy or the lack of it framed the subsequent discourse of the voters. The author notes that after an initial period of discussing SOPA itself, the discussion of whether it was legitimate or not for Wikipedia to become involved in the protest took over, with a major justification for it emerging in the form of an argument that it was legitimate for Wikipedia to protest against SOPA as SOPA threatened Wikipedia itself. While this is an interesting claim, unfortunately, other than citing one single comment, no other qualitative or quantitative data are provided; nor is the methodology discussed. We are not told how many individuals voted, how many commented on legitimacy or illegitimacy, or how many felt that Wikipedia was threatened; we do not know how the author classified comments supporting any of the viewpoints, or the shifts in the discussion ... this list could unfortunately go on. In one specific example drawn from the conclusion, the author writes that "The main factor that shaped the multi-phased process was the will to have the community accept the final decision as legitimate, and avoid backlash. This factor especially influenced those who are suspected of relying on traditional means of legitimacy such as charisma or professionalism." At the same time, we are provided with no number, no percentage, and certainly no correlation to back up this claim. Without a clear methodology or distinct data it is hard to verify the author's claims and conclusions.
The introduction also notes that "the mass effort of planning an effective political action was not something “anyone [could] edit”" and "the debate preceding the blackout did not follow Wikipedia’s open and anarchic decision-making system"; unfortunately this reviewer finds no justification for those rather strong claims anywhere else in the article.
Overall, this is an interesting paper about legitimacy in Wikipedia, but it seems to overreach when it tries to draw conclusions from data that are simply not presented to the reader. It suffers from a failure to explain the research's methodology, making verification of its claims very hard. Due to the lack of hard data, most conclusions are unfortunately rendered dubious, and the paper has a tendency to make strong claims that are not backed up by data or even developed later on.
Bots and collective intelligence explored in dissertation
Rats (blue trace) interacting with a rat-sized robot (red) controlled by a human who in turn perceives the rat's movements through those of a human-sized avatar in a virtual reality environment. The video was uploaded to Wikimedia Commons by the Open Access Media Importer Bot.
In his Communication and Society PhD dissertation, Randall M. Livingstone of the University of Oregon explores the relationship between the social and technical structures of Wikipedia, with a particular focus on bots and bot operators. After a fairly broad literature review (which summarizes the basic approaches to Wikipedia studies from new media theory, social network analysis, science and technology studies, and political economy), Livingstone gives a concise history of the technical development of Wikipedia, from UseModWiki to MediaWiki, and from a single server to hundreds.
The most interesting chapters for Wikipedians will be V – Wikipedia as a Sociotechnical System – and VI – Wikipedia as Collective Intelligence. Chapter 5 looks at the ways the editing community and the evolution of software (both MediaWiki and the semi-automated tools and bots that interact with editors and articles) "construct" each other. Based on 45 interviews with bot operators and WMF staff, this chapter gives an interesting and varied picture of how Wikipedia works as a sociotechnical system. It will in part be a familiar account to the more tech-minded Wikipedians, but offers an accessible overview of bots and their place in the ecosystem to editors who normally steer clear of bots and software development. Chapter 6 looks at theories of intelligence and the concept of collective intelligence, arguing that Wikipedia exhibits (at least to some extent) the key traits of stigmergy, distributed cognition, and emergence.
"History's most influential people" according to Wikipedia: While more in the realm of popular science, Wired UK, among others, published an infographic attributed to César Hidalgo, head of the MIT Media Lab's Macro Connections group, visualizing "History's most influential people". Unfortunately, beyond noting that rankings "are based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages", the short article does not describe any methodology, nor does it provide much discussion. Until a more extensive description is released, the current graph, while pretty, is little more than a trivia piece.
Teachers say 75% of teens use Wikipedia (or online encyclopedias) for research assignments: In a Pew Research survey among more than 2000 US middle and high school teachers, 75% said that their teenage students use "Wikipedia or other online encyclopedia" in research assignments, making online encyclopedias the second most popular source for students behind search engines such as Google. This number was lower (68%) "among teachers of the lowest income students (those living below the poverty line)" and higher (80%) for those teaching "mostly upper and upper middle income" students, and it also varied by subject (between 69% for teachers of English and 82% for science teachers). The survey report cautions that the sample "skews towards 'cutting edge' educators who teach some of the most academically successful students in the country".
The Google matrix of Wikipedia entries, from an earlier paper by the authors of this study.
"Wikipedia communities" as eigenvectors of its Google matrix: An ArXiv preprint studies the "Spectral properties of Google matrix of Wikipedia and other networks". This Google matrix consists of entries for each pair of pages (for the English Wikipedia, including non-mainspace pages like portals), roughly speaking modelling the behavior of a surfer who goes from one page to any of those that it links to, with equal probability (or, with probability $1-\alpha$, jumps to a random page; the damping parameter $\alpha$ is set to around 0.85 in the Google search engine). The PageRank appears as the eigenvector of this matrix for the eigenvalue $\lambda = 1$. The paper studies the spectrum (eigenvalues) and eigenvectors apart from this special case, interpreting them as certain topic areas: "the eigenvectors of the Google matrix of Wikipedia clearly identify certain communities which are relatively weakly connected with the Wikipedia core when the modulus of corresponding eigenvalue is close to unity. For moderate values of $|\lambda|$ we still have well defined communities which however have stronger links with some popular articles (e.g. countries) that leads to a more rapid decay of such eigenmodes."
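To make the surfer model concrete, here is a minimal sketch in Python of how a Google matrix can be built from a link graph and how PageRank falls out as its eigenvector for eigenvalue 1. It is not taken from the preprint: the four-page toy graph, the function names `google_matrix` and `pagerank`, the use of plain power iteration, and the common convention of treating pages without outgoing links as linking to every page are all assumptions made for illustration.

```python
def google_matrix(links, alpha=0.85):
    """Build a column-stochastic Google matrix from a link graph.

    links maps each page to the pages it links to (every target must
    also be a key). G[i][j] is the probability that a surfer on page j
    moves to page i: with probability alpha a uniformly chosen outgoing
    link is followed, with probability 1 - alpha a random page is picked.
    """
    pages = sorted(links)
    n = len(pages)
    index = {page: k for k, page in enumerate(pages)}
    G = [[0.0] * n for _ in range(n)]
    for source in pages:
        j = index[source]
        targets = set(links[source])
        for i in range(n):
            G[i][j] = (1.0 - alpha) / n          # random jump
        if targets:
            for target in targets:
                G[index[target]][j] += alpha / len(targets)
        else:
            # Assumed convention: a page with no links jumps anywhere.
            for i in range(n):
                G[i][j] += alpha / n
    return pages, G


def pagerank(G, iterations=200):
    """Approximate the eigenvector of G for eigenvalue 1 by power iteration."""
    n = len(G)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [sum(G[i][j] * rank[j] for j in range(n)) for i in range(n)]
    return rank


# Hypothetical four-page wiki used only for this illustration.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages, G = google_matrix(toy_links)
rank = pagerank(G)
for page, score in zip(pages, rank):
    print(page, round(score, 4))
```

Power iteration only recovers the leading eigenvector; studying the rest of the spectrum, as the preprint does, requires a general eigenvalue solver and is beyond this sketch.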
Serial singularities: developing of a network organization by organizing events: In a paper published in the Schmalenbach Business Review, Leonhard Dobusch and Gordon Müller-Seitz from the Freie Universität Berlin suggest that research on organized events has tended to treat those events as isolated and singular events. Using interviews and other data on Wikimania, chapter meetings, and local meet-ups over several years, the authors challenge this idea and show how many different events on different scales and scopes – each with a distinct character – can interact and reinforce each other to help drive the nature of a large distributed organization like Wikimedia.
The web mirrors value in the real world: comparing a firm’s valuation with its web network position: In an MIT Sloan Working Paper, Qiaoyun Yun and Peter Gloor create a measure of US and Chinese firms' "social network" position by looking at how those firms are linked to from a variety of web sources – prominently Wikipedia. They find a positive correlation between the betweenness centrality of a firm in a social network constructed from links online and its innovation capability and financial performance. They find that Wikipedia only predicts a firm's performance in the US.
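For readers unfamiliar with the measure, betweenness centrality counts how often a node lies on the shortest paths between other pairs of nodes. The following Python sketch computes it for a small undirected, unweighted graph using Brandes' algorithm. The firm names and links are hypothetical, and nothing here reproduces the working paper's actual data or pipeline.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph.

    graph maps each node to its neighbours; every edge must be listed
    in both directions. Uses Brandes' algorithm (one BFS per source).
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        order = []                                # nodes by increasing distance
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}       # number of shortest paths
        distance = {node: -1 for node in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:               # first visit
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2.0 for node, value in centrality.items()}


# Hypothetical link network: FirmB connects the otherwise separate firms.
firm_links = {
    "FirmA": ["FirmB"],
    "FirmB": ["FirmA", "FirmC", "FirmD"],
    "FirmC": ["FirmB"],
    "FirmD": ["FirmB"],
}
scores = betweenness_centrality(firm_links)
for firm in sorted(scores):
    print(firm, scores[firm])
```

In this toy network every shortest path between two other firms passes through FirmB, which is the kind of brokerage position the measure is designed to pick up.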
Teahouse analyzed: Jonathan Morgan, Sarah Stierch, Siko Bouterse and Heather Walls, from the Wikimedia Foundation Teahouse team, report on the impact of the initiative on 1,098 new Wikipedia contributors who joined the Teahouse between February and October 2012, in a paper to be presented at CSCW '13. The study reports that participants in the project "make more edits overall, and edit longer", "make more edits, to more articles" and "participate more in discussion spaces" compared to non-visitors. This paper is part of a research track entirely dedicated to Wikipedia Supported Collaborative Work, featuring three other studies.
Slides from the recently published Article Feedback research report.
Article feedback: The Wikimedia Foundation published an update about the Article feedback tool on the English Wikipedia, providing statistics about the usage of the feature, and about the moderation activities for the feedback provided.
New review of Good Faith Collaboration: The reviewer situates Joseph Reagle's 2010 book about Wikipedia (free online version) within a wider context of research on Wikipedia: "The reliability of the encyclopaedia’s content.. and quantitative analysis of large-scale public datasets formed the predominant approach in early empirical research on Wikipedia ... This was followed by a more social approach and the adopting of qualitative methods. In this switch to social norms and away from an ethnographic approach, Reagle's book is a main reference, particularly in terms of its cultural and historical specificity." Overall, the review finds that "The book is well documented, with an elaborative but accessible writing style, which is at times provocative. It results in a form of rich composition of eight pieces (chapters) of Wikipedia 'puzzle', even if some readers might miss a more explicit continuum linking the lines together. Finally, the book is a primary reference point for researchers aiming to study Wikipedia, especially for those unfamiliar with it."
Measuring the impact of Wikipedia for GLAM institutions: Ed Baker, software developer at the Natural History Museum in London, has started a series of blog posts on "the impact and use of Wikipedia by organisations". In the first post, he looked at how the scope of pages linking to the NHM's website fits with the overall scope of the institution when pages are ranked either by number of page views or by number of links to the NHM. The latter approach could help identify opportunities for a collaboration between GLAM institutions and the Wikimedia communities.
Keegan, B. (2012). How does Wikipedia deal with a mass shooting? A frenzied start gives way to a few core editors. Nieman Journalism Lab.
Seward, Z.M. (2012). Why Wikipedia beats Wikinews as a collaborative journalism project. Nieman Journalism Lab.
Yasseri, T. (2012). The coverage of a tragedy. Stories for Sunday morning.
Wang, C. (Alex), & Zhang, X. (Michael). (2012). Network Centrality and Contributions to Online Public Good–The Case of Chinese Wikipedia. 2012 45th Hawaii International Conference on System Sciences (pp. 4515–4524). IEEE.
López Marcos, P.; Sanz-Valero, J. (2012). "Presencia y adecuación de los principios activos farmacológicos en la edición española de la Wikipedia". Atención Primaria. doi:10.1016/j.aprim.2012.09.012.
The sentence, "Navigation templates located at the bottom of articles may be given a section heading such as "Related information", although the use of such headings has not yet been widely adopted," is under review. Should navboxes in articles have their own section headings?
The policy that "Album articles with little more than a track listing may be more appropriately merged into the artist's main article or discography article, space permitting" is under review. Redirection is used more often than merging.
A change to the policy of resysopping former administrators is under discussion. This discussion hopes to clear up cases of when to resysop admins and when they should go through the request process again.
What motivated you to join WikiProject New York City? Do you live in or near the Big Apple? Have you contributed to any of the project's Featured or Good Articles?
I've lived in Manhattan 45 years and seen much geography in 50 years bicycling. Have contributed to probably over a thousand local articles including a few thousand photographs. Don't know whether any of the articles reached Featured or Good status.
With New York ranking as the most populous city in the United States and one of the largest metropolitan areas in the world, is WikiProject New York City bustling with activity? What challenges does the project face with attracting and retaining editors?
The local chapter is active; we have several meetings per year. The Project likewise, especially the Transportation subproject. We have much to do, and many who are eager to do it.
Are any of the city's boroughs better covered than others on Wikipedia? What communities could use some attention? Has the project experienced difficulties acquiring photography for areas of the city that don't attract many tourists?
My own idyllic little island of Manhattan gets the majority of attention and especially photography, so for me it serves as home base for photo expeditions to places off the beaten touristic path including South Bronx and Gravesend last week, Rockaway day before yesterday, and the much-neglected Glendale, Ridgewood and Bushwick tomorrow.
New York's complex transportation system demands the attention of many articles under the project's scope. Does the project collaborate with any WikiProjects focusing on roads, mass transit, or waterways? Are there any articles or images that New Yorkers could easily contribute during their daily commute?
We've got our own bustling WP:WikiProject New York City Public Transportation whose actual purview is wider geographically and topically. Yes, usually on a train trip I snap a few Wikiphotos. Commuters ought to familiarize themselves with articles covering places along their route, and keep their eyes open and camera phone or compact camera ready.
Has the project worked with any galleries, libraries, archives, or museums to establish GLAM projects? Are there any institutions in New York that you'd like to see sponsor a GLAM project?
At the beginning of the year, we began a series of interviews with editors who have worked hard to combat systemic bias through the creation of featured content; although we haven't seen six installments yet, we've also had some delightful interviews with people who write articles on some of our most core topics. Now, as we close the year, I would like to present some of my own musings on the state of featured content – especially as it pertains to systemic bias and core topics.
For me, having a work promoted to featured status is the rough equivalent of having it published in a traditional encyclopedia; it is recognition that the article, picture, or the like is up to snuff. As such, although there is personal glory in turning a 300-word stub into a featured article, like at Sudirman, that is not the goal of featured content. By writing, photographing, or recording featured-level content, we are legitimising the crowd-sourcing method used by Wikipedia and showing that the 99 per cent can make a difference. This is not to say that featured content is the only quality content on Wikipedia: numerous articles and images are on par with or better than those found in paper encyclopedias, but owing to... let's say human nature... will have a difficult time at the featured content processes.
Compared to traditional encyclopedias, Wikipedia's expansive coverage of popular culture is second to none. However, in my experience there are two areas where we fall decidedly behind Britannica or Americana: our coverage of common-knowledge topics and our coverage of areas outside the Anglosphere. As such, our articles on writers or the Masalembu Islands are definitely in need of some tender loving care.
This can be challenging. For broader topics, the scope is typically daunting, and sources need to be chosen very selectively. Even when an editor or group of editors is willing to take on a topic, they may find themselves the target of edit warriors. A drive to improve information technology was derailed by arguments over capitalisation, while tree has been the target of (over)extensive tagging. For more minor topics – especially on non-Anglosphere topics – finding quality English-language sources can be impossible, forcing editors to use sources in the local language. Several of my articles, such as Oerip Soemohardjo, are by necessity almost devoid of English sources as the most in-depth looks were in Indonesian.
It can be done. This year, the Core Contest, which focuses on topics which every encyclopedia should have, ran twice. This resulted in several featured articles, including the above lettuce, as well as major expansions and improvements on topics ranging from language to the Alps. Other editors have taken on major topics more or less on their own, like at entertainment or the aforementioned Douglas MacArthur. For non-Anglophone areas, I have provided some seventeen pieces of featured content related to Indonesia, while editors such as (but not limited to) Lecen, MrPanyGoff, Muhammad Mahdi Karim, Arsenikk and Lemurbaby have worked extensively to bring quality content from their preferred areas.
Will 2013 bring more core and non-Anglophonic featured content, or was the 21st the end of that? Here's looking at you, in the new year.
Afroyim v. Rusk (nom) by Richwales. Afroyim v. Rusk is a 1967 US Supreme Court case in which it was ruled that US citizens may not be involuntarily deprived of their citizenship. The government's attempt to revoke the citizenship of Beys Afroyim after the latter voted in an Israeli election was deemed unconstitutional. The decision opened the way for a wider legal acceptance of multiple citizenship and has sparked policy changes.
Auriscalpium vulgare (nom) by Sasata. First described in 1753, A. vulgare is a species of fungus common throughout Europe, Central America, North America, and temperate Asia. The small mushroom requires high humidity and medium light for optimum development and often grows on conifer litter or cones that are buried in soil. The brown-capped fungus body is generally considered inedible owing to its rough texture.
SMS Kaiserin (nom) by Parsecboy. Kaiserin was the third vessel of the Kaiser class of battleships of the German Imperial Navy. Laid down in 1910, she was launched in 1911 and commissioned in 1913. Equipped with ten 30.5-centimeter (12.0 in) guns and with a top speed of 22.1 knots (40.9 km/h; 25.4 mph), the ship saw action throughout World War I. After the war, German crews scuttled her along with most of the German ships interned at Scapa Flow to prevent their seizure by the British.
Oregon Trail Memorial half dollar (nom) by Wehwalt. The Oregon Trail Memorial half dollar was struck intermittently by the US Bureau of the Mint between 1926 and 1939. Designed by Laura Gardin Fraser and James Earle Fraser, it commemorates those who traveled the Oregon Trail in the mid-19th century after a campaign by Ezra Meeker. Owing to public outcry over high prices, the government stopped minting the coins after only about 260,000 were produced.
Look Mickey (nom) by TonyTheTiger. Look Mickey is a Roy Lichtenstein oil-on-canvas painting considered a bridge between his abstract expressionism and pop art works. The 1961 painting, which shows Donald Duck and Mickey Mouse fishing, is the first example of Lichtenstein's employment of Ben-Day dots, speech balloons and comic imagery. Used in the artist's first solo exhibition, Look Mickey has been read as satirising pop culture's mass production of visual imagery.
List of songs recorded by Katy Perry (nom) by Calvin999. The American singer Katy Perry has recorded songs for three studio albums, beginning in gospel but switching to pop afterwards. She has released forty-seven songs in total, including seventeen singles and two promotional singles.
Matchbox Twenty discography (nom) by Holiday56. The American rock band Matchbox Twenty has released four studio albums, one compilation album, three video albums, two extended plays, twenty-three singles and sixteen music videos since their debut in 1996. Their debut album, Yourself or Someone Like You, remains their best-selling album.
List of Dragon Quest media (nom) by PresN. The video game series Dragon Quest, published by Square Enix, has seen ten main instalments as well as numerous spin-off games and tie-in media, including books, television series, and soundtrack albums. The first game was released in 1986.
Nelly discography (nom) by Sufur222. The American rapper and singer Nelly has released nine albums, two extended plays, two mixtapes, forty-seven singles, and forty-five music videos since making his debut in 2000. His debut album Country Grammar is his most successful to date, selling nearly 8.5 million copies.
Common Starling (nom; related article), created by PierreSelim and nominated by Tomer T. The Common Starling (Sturnus vulgaris) is a species of bird native to most of temperate Europe and western Asia. First described in 1758, it consists of several subspecies.
Galina Vishnevskaya (nom; related article), created by Yustas, edited by Diliff, and nominated by Julia W. Galina Pavlovna Vishnevskaya (1926–2012) was a Russian soprano opera singer, recitalist, and wife of cellist Mstislav Rostropovich. She was named a People's Artist of the USSR in 1966.
Pismis 24 (nom; related article), created by NASA, ESA and Jesús Maíz Apellániz and nominated by Mediran. Pismis 24 is an open cluster located in the nebula NGC 6357, home to numerous enormous stars – including two of more than 100 solar masses in a binary system.
A video about one new feature launched in 2012 (Special:NewPagesFeed and its associated page curation toolbar) displayed using another (the TimedMediaHandler).
In the first of two features, the Signpost this week looks back on 2012, a year when developers finally made inroads into three issues that had been put off for far too long (the need for editors to learn wiki-markup, the lack of a proper template language and the centralisation of data) but left all three projects far from finished.
The overall result was a year of numerous incremental changes (including Special:NewPagesFeed, new diff colours, MathJax support, high-resolution images, database dump mirroring) but few genuinely watershed moments. One exception, however, was the switchover in version control system from Subversion to Git in March. For a complex transition, the switch was made relatively easily and more-or-less on schedule, although the top-down nature of it – and in particular the choice of code review tool Gerrit – continues to rankle with some developers even now. The possibility of entrenching a division between staff and volunteer-written code, highlighted in last year's annual review, was successfully avoided, though the presence of a de facto distinction – first established by a Signpost investigation in September – is an ongoing concern likely to remain on the agenda for most if not all of 2013.
Other big gainers included Wikipedia Zero, the Wikimedia Foundation's drive to make a (sometimes text-only) version of its flagship project available for free on internet-enabled handsets across the developing world, which went from strength to strength over the course of the year. Although the programme was only in development this time last year, a quarter of a billion people are now estimated to have free access under the system, with more than a dozen further partnerships already agreed. An Android app was also released in 2012, and the predicted mobile web upload facility is now in development, building on the success of a "Wiki Loves Monuments" app that included similar functionality. A mobile editing interface, scheduled for March, was not so lucky. As forecast, support for accessing Wikipedia via SMS/USSD has now been implemented, though it is yet to go live.
There were low points too, both technical and social. Downtime was not as rare as the Wikimedia Operations team would have liked, while untested (or insufficiently tested) code, deployed live, caused problems on a similar scale. The fine line between constructive criticism and personal attacks, particularly where top-down decisions were concerned, remained well trodden, not least amid the rise of Wikimedia Labs at the expense of the independently run (but not independently financed) Toolserver.
On a more positive note, the TimedMediaHandler extension (improving MediaWiki's handling of video files) was finally deployed to Wikimedia wikis following a drawn-out development process few would wish to emulate. Only time will tell whether the lessons learned will ensure the Lua coding, VisualEditor and Wikidata projects – now 15, 13 and 9 months old respectively – can reach the same end any quicker; but more on that next week, when the Signpost looks forward to what 2013 may have in store for Wikimedia wikis and MediaWiki more generally.
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.
Tor be or not Tor be: developers consider the problems of anonymised editing: Trying to provide a fuller service to users of anonymity network Tor may be "aiming too high", wrote one contributor to a thread on the mailing list this week. Tor, a controversial system for anonymously accessing the web, is popular among citizens of countries such as China, Burma and Iran as a means of safely evading tight censorship controls, but can also be easily manipulated by repeat vandals to evade blocks. Differentiating between the two is at present virtually impossible, with users blocked from editing unless they can prove their good intentions – something which is impossible if they have no other way of editing. Although some incremental improvements were proposed to the process of blocking Tor users, most obviously in the literature provided, it is unclear whether there is any way of allowing editing whilst also preventing abuse.
Translate extension getting facelift: The Translate extension, which helps users translate (among other things) MediaWiki interface messages, is benefiting from a WMF-led visual overhaul, WMF Software Engineer Amir E. Aharoni writes on the Wikimedia blog. The facelift is intended to make it easier to translate new messages, as well as allowing the host wiki (translatewiki.net for MediaWiki interface messages) to surface statistics about which messages out of a batch users struggle to translate.