Academic studies about Wikipedia: Difference between revisions
Revision as of 18:43, 25 June 2017
Ever since Wikipedia was a few years old, there have been numerous academic studies about Wikipedia in peer-reviewed publications. This research can be grouped into two categories. The first analyzed the production and reliability of the encyclopedia content, while the second investigated social aspects, such as usage and administration. Such studies are greatly facilitated by the fact that Wikipedia's database can be downloaded without help from the site owner.[1]
Content
Production
A minority of editors produce the majority of persistent content
In a landmark peer-reviewed paper,[2] also mentioned in The Guardian,[3] a team of six researchers from the University of Minnesota measured the relationship between editors' edit counts and their ability to convey their writing to Wikipedia readers, measured in terms of persistent word views (PWV): the number of times a word introduced by an edit is viewed. The accounting method is best described in the authors' own words: "each time an article is viewed, each of its words is also viewed. When a word written by editor X is viewed, he or she is credited with one PWV." The number of times an article was viewed was estimated from the web server logs.
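The accounting rule quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function name, the `revisions` data layout, and the toy numbers are assumptions made for the example.

```python
from collections import Counter

def persistent_word_views(revisions):
    """Credit each editor with one PWV per word per view.

    `revisions` is a hypothetical list of (word_authors, views) pairs:
    `word_authors` names the editor who introduced each word of a revision,
    and `views` is how often that revision was viewed while it was current.
    """
    pwv = Counter()
    for word_authors, views in revisions:
        for author in word_authors:
            pwv[author] += views
    return pwv

# Toy data (invented): a three-word revision viewed 10 times,
# followed by a four-word revision viewed 5 times.
revisions = [
    (["X", "X", "Y"], 10),
    (["X", "Y", "Y", "Z"], 5),
]
totals = persistent_word_views(revisions)
grand_total = sum(totals.values())
shares = {editor: count / grand_total for editor, count in totals.items()}
```

Dividing each editor's total by the grand total gives the kind of PWV share that the study reports for groups of editors.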
The researchers analyzed 25 trillion PWVs attributable to registered users in the interval from September 1, 2002 to October 31, 2006. At the end of this period, the top 10% of editors (by edit count) were credited with 86% of PWVs, the top 1% about 70%, and the top 0.1% (4200 users) were attributed 44% of PWVs, i.e. nearly half of Wikipedia's "value" as measured in this study. The top 10 editors (by PWV) contributed only 2.6% of PWVs, and only three of them were in the top 50 by edit count. From the data, the study authors derived the following relationship:
Growth of PWV share increases super-exponentially by edit count rank; in other words, elite editors (those who edit the most times) account for more value than they would [be attributed] given a power-law relationship.
The study also analyzed the impact of bots on content. By edit count, bots dominate Wikipedia; 9 of the top 10 and 20 of the top 50 are bots. In contrast, in the PWV ranking only two bots appear in the top 50, and none in the top 10.
Based on the steady growth of the influence of the top 0.1% of editors, as measured by PWV, the study concluded unequivocally:
... Frequent editors dominate what people see when they visit Wikipedia and ... this domination is increasing.
Work distribution and social strata
A peer-reviewed paper noted the "social stratification in the Wikipedia society" due to the "admins class". The paper suggested that such stratification could be beneficial in some respects but recognized a "clear subsequent shift in power among levels of stratification" due to the "status and power differentials" between administrators and other editors.[4]
Analyzing the entire edit history of Wikipedia up to July 2006, the same study determined that the influence of administrator edits on content has steadily diminished since 2003, when administrators performed roughly 50% of total edits, to 2006 when only 10% of the edits were performed by administrators. This happened despite the fact that the average number of edits per administrator had increased more than fivefold during the same period. This phenomenon was labeled the "rise of the crowd" by the authors of the paper. An analysis that used the number of words edited, rather than the number of edit actions, as its metric showed a similar pattern. Because the admin class is somewhat arbitrary with respect to the number of edits, the study also considered a breakdown of users into categories based on the number of edits performed. The results for "elite users", i.e. users with more than 10,000 edits, were somewhat in line with those obtained for administrators, except that "the number of words changed by elite users has kept up with the changes made by novice users, even though the number of edits made by novice users has grown proportionally faster". The elite users were attributed about 30% of the changes for 2006. The study concludes:
Thus though their influence may have waned in recent years, elite users appear to continue to contribute a sizeable portion of the work done in Wikipedia. Furthermore, ... edits made by elite users appear to be substantial in nature. ...
Reliability
Jean Goodwin has assessed whether trust in Wikipedia is based on epistemic or pragmatic merits. While readers may not assess the actual knowledge and expertise of the authors of a given article, they may assess the contributors' passion for the project and the communicative design through which that passion is made manifest, which provide a reason for trust.[5]
Geography
Wikipedia articles cover about half a million places on Earth. However, research conducted by the Oxford Internet Institute has shown that the geographic distribution of articles is highly uneven. Most articles are written about North America, Europe, and East Asia, with very little coverage of large parts of the developing world, including most of Africa.[6]
Natural language processing
The textual content and the structured hierarchy of Wikipedia have become an important knowledge source for researchers in natural language processing and artificial intelligence. In 2007 researchers at Technion – Israel Institute of Technology developed a technique called Explicit Semantic Analysis[7] which uses the world knowledge contained in Wikipedia articles. Conceptual representations of words and texts are created automatically and used to compute the similarity between words and texts.
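The idea of representing texts as weighted vectors over Wikipedia concepts can be sketched as follows. This is a toy illustration under stated assumptions, not the published system: the three-article corpus `CONCEPTS` is invented, and the use of TF-IDF weights with cosine similarity is an assumption of this sketch that the paragraph above does not spell out.

```python
import math
from collections import Counter

# Hypothetical miniature "Wikipedia": concept title -> article text.
CONCEPTS = {
    "Cat": "cat feline pet animal whiskers",
    "Dog": "dog canine pet animal bark",
    "Car": "car engine wheel road vehicle",
}

def build_index(concepts):
    """Map each word to its TF-IDF weight in every concept that contains it."""
    n = len(concepts)
    term_counts = {title: Counter(text.split()) for title, text in concepts.items()}
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)
    index = {}
    for title, counts in term_counts.items():
        for word, tf in counts.items():
            index.setdefault(word, {})[title] = tf * math.log(n / doc_freq[word])
    return index

def concept_vector(text, index):
    """Sum the concept weights of all known words in the text."""
    vector = Counter()
    for word in text.split():
        for title, weight in index.get(word, {}).items():
            vector[title] += weight
    return vector

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

index = build_index(CONCEPTS)
related = cosine(concept_vector("cat", index), concept_vector("pet", index))
unrelated = cosine(concept_vector("cat", index), concept_vector("car", index))
```

In this toy corpus "cat" and "pet" share the concept "Cat" and so receive a positive similarity, while "cat" and "car" share no concept and score zero.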
Researchers at Ubiquitous Knowledge Processing Lab use the linguistic and world knowledge encoded in Wikipedia and Wiktionary to automatically create linguistic knowledge bases which are similar to expert-built resources like WordNet.[8] Strube and Ponzetto created an algorithm to identify relationships among words by traversing Wikipedia via its categorization scheme, and concluded that Wikipedia had created "a taxonomy able to compete with WordNet on linguistic processing tasks."[9]
Critiques of content fields
Health information
Health information on Wikipedia is commonly reached through search engines, whose result pages frequently link to Wikipedia articles.[10] Independent assessments have been made of the quality of health information provided on Wikipedia and of who is accessing it. The number and demographics of people who seek health information on Wikipedia, the scope of that information, and its quality have been studied.[11] There are drawbacks to using Wikipedia as a source of health information.
Social aspects
Demographics
A 2007 study by Hitwise, reproduced in Time magazine,[12] found that visitors to Wikipedia are almost equally split 50/50 male/female, but that 60% of edits are made by male editors.
WikiWarMonitor, which is part of the European Commission CORDIS FP7 FET-Open supported project ICTeCollective, has published:
- In 2011, in IEEE Xplore, "Edit wars in Wikipedia" for the IEEE Third International Conference on Social Computing (SocialCom).[13]
- In 2012, in PLoS ONE, a report that, based on circadian activity pattern analysis, the shares of contributions to the English Wikipedia from North America and Europe are almost equal, whereas the European share rises to 75% for the Simple English Wikipedia. The research also covers some demographic analysis of the editions in other languages.[14]
- In 2013, in Physical Review Letters, "Opinions, Conflicts, and Consensus: Modeling Social Dynamics in a Collaborative Environment".[15]
- In 2014, published as a book, "The Most Controversial Topics in Wikipedia: A Multilingual and Geographical Analysis", which analysed the volume of editing of articles in various language versions of Wikipedia in order to establish the most controversial topics in different languages and groups of languages. For the English version, the top three most controversial articles were George W. Bush, Anarchism and Muhammad. Topics in other languages causing most controversy were Croatia (German), Ségolène Royal (French), Chile (Spanish) and Homosexuality (Czech).[16]
Policies and guidelines
A descriptive study[17] that analyzed English language Wikipedia's policies and guidelines up to September 2007 identified a number of key statistics:
- 44 official policies
- 248 guidelines
Even a short policy like "ignore all rules" was found to have generated a lot of discussion and clarifications:
While the "Ignore all rules" policy itself is only sixteen words long, the page explaining what the policy means contains over 500 words, refers readers to seven other documents, has generated over 8,000 words of discussion, and has been changed over 100 times in less than a year.
The study sampled the expansion of some key policies since their inception:
- Wikipedia:Ignore all rules: 3600% (including the additional document explaining it)
- Wikipedia:Consensus: 1557%
- Wikipedia:Copyrights: 938%
- Wikipedia:What Wikipedia is not: 929%
- Wikipedia:Deletion policy: 580%
- Wikipedia:Civility: 124%
The number for "deletion" was considered inconclusive, however, because the policy was split into several sub-policies.
Power plays
A 2007 joint peer-reviewed study[18] conducted by researchers from the University of Washington and HP Labs examined how policies are employed and how contributors work towards consensus by quantitatively analyzing a sample of active talk pages. Using a November 2006 database dump, the study focused on 250 talk pages in the tail of the distribution: 0.3% of all talk pages, but containing 28.4% of all talk page revisions, and more significantly, containing 51.1% of all links to policies. From the sampled pages' histories, the study examined only the months with high activity, called critical sections—sets of consecutive months where both article and talk page revisions were significant in number.
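The notion of a critical section, a run of consecutive months in which both article and talk page revisions are numerous, can be sketched as below. This is an illustration only: the function name, the monthly counts, and the cutoff `threshold=20` are assumptions, since the text above does not say what counted as "significant in number".

```python
def critical_sections(article_revs, talk_revs, threshold):
    """Return (first_month, last_month) index pairs for every maximal run of
    consecutive months in which both revision counts reach `threshold`."""
    sections = []
    start = None
    months = min(len(article_revs), len(talk_revs))
    for month in range(months):
        active = article_revs[month] >= threshold and talk_revs[month] >= threshold
        if active and start is None:
            start = month                      # a new run begins
        elif not active and start is not None:
            sections.append((start, month - 1))  # the run ended last month
            start = None
    if start is not None:
        sections.append((start, months - 1))   # run still open at the end
    return sections

# Invented monthly revision counts for one page history.
article_revs = [2, 40, 55, 3, 60, 70, 80]
talk_revs = [1, 30, 45, 50, 5, 35, 40]
sections = critical_sections(article_revs, talk_revs, threshold=20)
```

With these invented counts, months 1 to 2 and months 5 to 6 form the two critical sections; months 3 and 4 are excluded because only one of the two counts is high.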
The study defined and calculated a measure of policy prevalence. A critical section was considered policy-laden if its policy factor was at least twice the average. Articles were tagged with 3 indicator variables:
- controversial
- featured
- policy-laden
The combinations of these three binary indicators yielded 8 sampling categories. The study intended to analyze 9 critical sections from each sampling category, but only 69 critical sections could be selected because only 6 articles (histories) were simultaneously featured, controversial, and policy-laden.
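The sampling arithmetic can be checked with a short sketch. It assumes, beyond what the text states, that each selected history contributes one critical section and that every category other than the fully positive one had at least 9 histories available.

```python
from itertools import product

INDICATORS = ("controversial", "featured", "policy-laden")
TARGET_PER_CATEGORY = 9

# Every combination of the three yes/no indicators is one sampling category.
categories = list(product((False, True), repeat=len(INDICATORS)))

# Assumption: all categories could supply the target of 9, except the one
# where all three indicators hold, for which the text reports only 6 histories.
available = {category: TARGET_PER_CATEGORY for category in categories}
available[(True, True, True)] = 6

sampled = sum(min(TARGET_PER_CATEGORY, n) for n in available.values())
```

Seven categories at 9 sections each plus 6 from the last category gives the 69 critical sections reported.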
The study found that policies were by no means consistently applied. Illustrative of its broader findings, the report presented the following two extracts from Wikipedia talk pages in obvious contrast:
- a discussion where participants decided that calculating a mean from data provided by a government agency constituted original research:
is the mean...not considered original research? [U3]
It doesn't look like it to me, it looks like the original research was done by [Gov't agency] or am I missing something? [U4]
If the [Gov't agency] has not published the actual mean, us "calculating" it would be OR, no? I'm not sure. [U3]
No, why would it be? Extrapolating data from info already available is not OR. [U5]
From WP:NOR "articles may not contain any new analysis or synthesis of published arguments, concepts, data, ideas or statements that serves to advance a position." For what' worth... [U4]
- a discussion where logical deduction was used as counterargument for the original research policy:
Your notion is WP:OR. I can easily provide. . . a scholarly article that says that anti-authoritarianism is not central to Panism. You are synthesizing all kinds of ideas here, based on your POV. [U6]
Simple deductive reasoning is not original research. Panism is inherently anti-authoritarian; therefore, an authoritarian economic system cannot be Panist. Which do you disagree with: the premise or the conclusion? [U7]
Claiming that such ambiguities easily give rise to power plays, the study identified, using the methods of grounded theory (Strauss), 7 types of power plays:
- article scope (what is off-topic in an article)
- prior consensus (past decisions presented as absolute and uncontested)
- power of interpretation (a sub-community claiming greater interpretive authority than another)
- legitimacy of contributor (his/her expertise)
- threat of sanction (blocking etc.)
- practice on other pages (other pages being considered models to follow)
- legitimacy of source (authority of references being disputed)
Due to lack of space, the study detailed only the first 4 types of power plays that were exercised by merely interpreting policy. A fifth power play category was analyzed; it consisted of blatant violations of policy that were forgiven because the contributor was valued for his contributions despite his lack of respect for rules.
Article scope
The study considers that Wikipedia's policies are ambiguous on scoping issues. The following vignette is used to illustrate the claim:
... consensus is bullshit because I have the facts on my side. I also have the exhortation of Wikipedia to be bold... deleting a discussion of the Catholic church's... view of paleocentrism is not only inaccurate, but violates WP:NPOV ... .Deleting/emasculating it would violate several Wikipedia policies: NPOV, be bold... If you all want an article just on the scientific theory of paleocentrism, write one yourself. [U12]
We DID write an article just on the scientific theory of paleocentrism, before you showed up... You're obviously new here, [U12]... arguing based on your reading of NPOV and Be bold is a bit ridiculous, like a kid just out of high school arguing points of constitutional law. These things are principles that have an established meaning. People who have been here for years understand them much better than you do. They won't prove effective weapons for you to wield in this argument... [U13]
The social impact of "paleocentrism" is not "paleocentrism"... Wikipedia:wiki is not paper, we don't need to cram every tertiary aspect of the topic into the article proper, and we don't need to consider it incomplete when we don't ... [U14]
... the first thing the link Wikipedia:wiki is not paper says is:""Wikipedia "is" an encyclopedia."" A real encyclopedia like Encyclopædia Britannica has a fantastic section on paleocentrism, including all the social, political, and philosophical implications. [U12]
As discussed at Wikipedia:wiki is not paper, Wikipedia articles should give a brief overview of the centrally important aspects of a subject. To a biologist like yourself, the centrally aspect of paleocentrism certainly isn't its social implications, but to the rest of society it is. ... [U12]
... What you're talking about isn't "paleocentrism". Central issues to paleocentrism are periodic equilibrium, geomorphous undulation, airation. These are the issues that actually have to do with the process of paleocentrism itself. These "social aspects" you're talking about are "peripheral", "not central". They are "about" paleocentrism, they "surround" paleocentrism, but they "are not paleocentrism"... [U15]
The study gives the following interpretation for the heated debate:
Such struggles over article scope take place even in a hyper-linked environment because the title of an article matters. The "paleocentrism" article is more prestigious and also more likely to be encountered by a reader than an article entitled "the social effect of paleocentrism."
Prior consensus
The study remarks that in Wikipedia consensus is never final, and what constitutes consensus can change at any time. The study finds that this temporal ambiguity is fertile ground for power plays, and places the generational struggle over consensus in the larger picture of the struggle for article ownership:
In practice, ... there are often de facto owners of pages or coalitions of contributors that determine article content. Prior consensus within this group can be presented as incontestable, masking the power plays that may have gone into establishing a consensus. ... At issue is the legitimacy of prior consensus. Longtime contributors do not want to waste time having arguments about issues that they consider to be solved. Pointing to prior consensus, just like linking to policies, provides a method for dealing with trollish behavior. On the other hand, newcomers or fringe contributors often feel that their perspectives were not represented in prior arguments and want to raise the issue again.
The study uses the following discussion snippet to illustrate this continuous struggle:
Most all the stuff [U17] describes below has already been hashed out. . . It's like that game of whack-a-mole: they try one angle, it gets refuted; they try a second angle, it gets refuted; they try a third angle, it gets refuted; and then they try the first angle again. [U18]
It would be interesting to see how many different users try to contribute to this article and to expand the alternate views only to be bullied away by those who believe in [Cosmic Polarity] religiously... why don't you consider that perhaps they have a point and that [U19], [U20] and the rest of you drive editors away from this article with your heavy-handed, admin-privileged POV push? [U21]
Power of interpretation
A vignette illustrated how administrators overrode consensus and deleted personal accounts of user/patients suffering from an anonymized illness (named Frupism in the study). The administrator's intervention happened as the article was being nominated to become a featured article.
Legitimacy of contributor
This type of power play is illustrated by a contributor (U24) who draws on his past contributions to argue against another contributor who is accusing U24 of being unproductive and disruptive:
Oh, you mean "I" hang around to make a point about the lack of quality on Wikipedia? Please take another look at my edit count!! LOL. I have over 7,000 edits... As you know, I can take credit for almost entirely writing from scratch 2 of the 6 or 7 FAs in philosophy... [U24]
Explicit vie for ownership
The study finds that there are contributors who consistently and successfully violate policy without sanction:
U24 makes several blatant "us or them" vies for power: if U25's actions persist, he will leave. ... Such actions clearly violate policies against article ownership, civility toward other contributors, and treatment of newcomers. As a newcomer, U25 may not know of these policies, but U26 certainly does. The willing blindness [of U26] stems from the fact that U24 is a valued contributor to philosophy articles and is not bashful about pointing this out. There is a scarcity of contributors with the commitment to consistently produce high-quality content; the Wikipedian community is willing to tolerate abuse and policy violations if valued work is being done. ...
With all due respect, that didn't answer the question... I wanted to know what it was in U25's proposal which was unacceptable. . . His lack of reference etc. is all a fault, sure, but that's why I provided one (Enquiry, section 8). [U26]
... this point is already addressed in the article... It may need to be expanded a bit. I can easily do that myself when I have time... Is there anythin else? Do you also support U25's vie that the article is "poor", that is needs to overhauled from top to bottom, the meanignlsess nonsens that he actually did try to insert above or the other OR that he has stated on this page? Basically, there are two sides on this matter, this article can be taken over by cranks like what's his name, or not? If it does, I go. You can either support me or not. Where do you stand?... [U24]
I do not by any stretch of the imagination support the view that the article is poor. In fact, I disagree with many of the things U25 has said elsewhere on this page... I'm genuinely sorry if this upset you. [U26]
Obtaining administratorship
Researchers from Carnegie Mellon University devised[19] a probit model of editors who have successfully passed the peer review process to become admins. Using only Wikipedia metadata, including the text of edit summaries, their model is 74.8% accurate in predicting successful candidates.
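For readers unfamiliar with probit models, the general form can be sketched as follows: a weighted sum of candidate attributes is passed through the standard normal cumulative distribution function to give a success probability. The feature names and coefficient values below are placeholders invented for illustration; they are not the paper's estimates.

```python
import math

def standard_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_success_probability(features, coefficients, intercept=0.0):
    """Probit link: P(success) = Phi(intercept + sum of coefficient * feature)."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return standard_normal_cdf(z)

# Placeholder coefficients for illustration only; NOT taken from the paper.
coefficients = {"previous_rfa_attempts": -0.4, "policy_edits_thousands": 0.5}

first_timer = probit_success_probability(
    {"previous_rfa_attempts": 0, "policy_edits_thousands": 1}, coefficients
)
repeat_candidate = probit_success_probability(
    {"previous_rfa_attempts": 2, "policy_edits_thousands": 1}, coefficients
)

# One simple way to turn a probability into a predicted outcome.
predicted_success = first_timer >= 0.5
```

Comparing such predictions against actual outcomes is how an accuracy figure like the one quoted above would be computed, although the exact evaluation procedure is not described here.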
The paper observed that despite protestations to the contrary, "in many ways election to admin is a promotion, distinguishing an elite core group from the large mass of editors." Consequently, the paper used policy capture[20]—a method that compares nominally important attributes to those that actually lead to promotion in a work environment.
The overall success rate for promotion was 53%, dropping from 75% in 2005 to 42% in 2006 and 2007. This sudden increase in failure rate was attributed to a higher standard that recently promoted administrators had to meet, and supported by anecdotal evidence from another recent study[21] quoting some early admins who have expressed doubt that they would pass muster if their election (RfA) were held recently. In light of these developments the study argued that:
The process once called "no big deal" by the founder of Wikipedia has become a fairly big deal.
Significant factors affecting RfA outcome (numbers in parentheses are not statistically significant at p<.05):
Factor | 2006–2007 | pre–2006 |
---|---|---|
number of previous RfA attempts | -14.7% | -11.1% |
months since first edit | 0.4% | (0.2%) |
every 1000 article edits | 1.8% | (1.1%) |
every 1000 Wikipedia policy edits | 19.6% | (0.4%) |
every 1000 WikiProject edits | 17.1% | (7.2%) |
every 1000 article talk edits | 6.3% | 15.4% |
each Arb/mediation/wikiquette edit | -0.1% | -0.2% |
diversity score (see text) | 2.8% | 3.7% |
minor edits percentage | 0.2% | 0.2% |
edit summaries percentage | 0.5% | 0.4% |
"thanks" in edit summaries | 0.3% | (0.0%) |
"POV" in edit summaries | 0.1% | (0.0%) |
Admin attention/noticeboard edits | -0.1% | (0.2%) |
Contrary to expectations perhaps, "running" for administrator multiple times is detrimental to the candidate's chance of success. Each subsequent attempt has a 14.8% lower chance of success than the previous one. Length of participation in the project makes only a small contribution to the chance of a successful RfA.
Another significant finding of the paper is that one Wikipedia policy edit or WikiProject edit is worth ten article edits. A related observation is that candidates with experience in multiple areas of the site stood a better chance of election. This was measured by the diversity score, a simple count of the number of areas that the editor has participated in. The paper divided Wikipedia into 16 areas: article, article talk, articles/categories/templates for deletion (XfD), (un)deletion review, etc. (see paper for full list). For instance, a user who has edited articles, her own user page, and posted once at (un)deletion review would have a diversity score of 3. Making a single edit in any additional region of Wikipedia correlated with a 2.8% increased likelihood of success in gaining administratorship.
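The diversity score is simple enough to state as code. The sketch below reproduces the worked example from the paragraph above; the area labels and edit counts are illustrative and are not the paper's exact list of 16 areas.

```python
def diversity_score(edit_counts):
    """Count the areas in which the editor has made at least one edit."""
    return sum(1 for count in edit_counts.values() if count > 0)

# Hypothetical editor: edits to articles and her own user page, one post at
# (un)deletion review, and no article talk edits.
editor = {
    "article": 250,
    "user page": 12,
    "(un)deletion review": 1,
    "article talk": 0,
}
score = diversity_score(editor)
```

Because the score counts areas rather than edits, a single edit in a new area raises it by one, which is the unit to which the 2.8% figure applies.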
Making minor edits also helped, although the study authors consider that this may be so because minor edits correlate with experience. In contrast, each edit to an Arbitration or Mediation committee page, or a Wikiquette notice, all of which are venues for dispute resolution, decreases the likelihood of success by 0.1%. Posting messages to administrator noticeboards had a similarly deleterious effect. The study interpreted this as evidence that editors involved in escalating or protracted conflicts lower their chances of becoming administrators.
Saying "thanks" or variations thereof in edit summaries, and pointing out point of view ("POV") issues (also only in edit summaries because the study only analyzed metadata) were of minor benefit, contributing 0.3% and 0.1% respectively to a candidate's chances in 2006–2007, but did not reach statistical significance before then.
A few factors that were found to be irrelevant or marginal at best:
- Editing user pages (including one's own) does not help. Somewhat surprisingly, user talk page edits also do not affect the likelihood of administratorship.
- Welcoming newcomers or saying "please" in edit summaries had no effect.
- Participating in consensus-building, such as RfA votes or the village pump, does not increase the likelihood of becoming admin. The study admits however that participation in consensus was measured quantitatively but not qualitatively.
- Vandal-fighting, as measured by the number of edits to the vandalism noticeboard, had no effect. Every thousand edits containing variations of "revert" was positively correlated (7%) with adminship for 2006–2007, but did not attain statistical significance unless one is willing to lower the threshold to p<.1. More confusingly, before 2006 the number of reverts was negatively correlated (-6.8%) with adminship success, again without attaining statistical significance even at p<.1. This may be because of the introduction of a policy known as "3RR" in 2006 to reduce reverts.[22]
The study suggests that some of the 25% unexplained variability in outcomes may be due to factors that were not measured, such as quality of edits or participation in off-site coordination, such as the (explicitly cited) secret mailing list reported in The Register.[23] The paper concludes:
Merely performing a lot of production work is insufficient for "promotion" in Wikipedia. Candidates' article edits were weak predictors of success. They also have to demonstrate more managerial behavior. Diverse experience and contributions to the development of policies and WikiProjects were stronger predictors of RfA success. This is consistent with the findings that Wikipedia is a bureaucracy[17] and that coordination work has increased substantially.[24][25] ... Participation in Wikipedia policy and WikiProjects was not predictive of adminship prior to 2006, suggesting the community as a whole is beginning to prioritize policymaking and organization experience over simple article-level coordination.
Subsequent research by another group[26] probed the sensemaking activities of individuals during their contributions to RfA decisions. This work establishes that decisions about RfA candidates are based on a shared interpretation of evidence in the wiki and histories of prior interactions.
Machine learning
Automated semantic knowledge extraction using machine learning algorithms is used to "extract machine-processable information at a relatively low complexity cost".[27] DBpedia uses structured content extracted from infoboxes by machine learning algorithms to create a resource of linked data in a Semantic Web.[28]
Wikipedia view statistics and movie revenue
In a study published in PLoS ONE,[29] researchers from the Oxford Internet Institute and Central European University showed that the page view statistics of articles about movies are well correlated with their box office revenue. They developed a mathematical model to predict box office takings by analysing the page view counts as well as the number of edits and unique editors of the Wikipedia pages on movies.[30][31] In a related work published in Scientific Reports in 2013,[32] Helen Susannah Moat, Tobias Preis and colleagues demonstrated a link between changes in the number of views of Wikipedia articles relating to financial topics and subsequent large stock market moves.[33][34]
References
- ^ Stuckman, Jeff; Purtilo, James (2009). "Measuring the wikisphere". Proceedings of the 5th International Symposium on Wikis and Open Collaboration. WikiSym '09: 1. doi:10.1145/1641309.1641326. ISBN 978-1-60558-730-1.
- ^ Priedhorsky, Reid; Chen, Jilin; Lam, Snider (Tony); Panciera, Katherine; Terveen, Loren; Austin, Shane (2007). "Creating, Destroying, and Restoring Value in Wikipedia". Proceedings of the 2007 international ACM conference on Supporting group work. Conference on Supporting Group Work. ACM Press. pp. 259–268. doi:10.1145/1316624.1316663. ISBN 978-1-59593-845-9.
- ^ Baker, Nicholson (2008-04-10). "How I fell in love with Wikipedia". The Guardian. Retrieved 2010-11-29.
- ^ Chi, Ed; Kittur, Aniket; Pendleton, Bryan A.; Suh, Bongwon; Mytkowicz, Todd (2007-01-31). "Power of the Few vs. Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie" (PDF). Computer/Human Interaction 2007 Conference. Association for Computing Machinery. Retrieved 2017-04-23.
- ^ Goodwin, Jean. (2010). The authority of Wikipedia. In Juho Ritola (Ed.), Argument cultures: Proceedings of the Ontario Society for the Study of Argumentation Conference. Windsor, ON, Canada: Ontario Society for the Study of Argumentation. CD-ROM. 24 pp.
- ^ Graham, Mark (2009-11-12). "Mapping the Geographies of Wikipedia Content". Mark Graham: Blog. ZeroGeography. Retrieved 2009-11-16.
- ^ Gabrilovich, Evgeniy; Markovitch, Shaul (2007). "Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis". Proceedings of IJCAI. Morgan Kaufmann Publishers Inc. pp. 1606–1611. CiteSeerX 10.1.1.76.9790.
- ^ Zesch, Torsten; Müller, Christoph; Gurevych, Iryna (2008). "Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary" (PDF). Proceedings of the Conference on Language Resources and Evaluation (LREC).
- ^ M Strube; SP Ponzetto (2006). "WikiRelate! Computing semantic relatedness using Wikipedia" (PDF). Proceedings of the National Conference.
- ^ Laurent, M. R.; Vickers, T. J. (2009). "Seeking Health Information Online: Does Wikipedia Matter?". Journal of the American Medical Informatics Association. 16 (4): 471–479. doi:10.1197/jamia.M3059. PMC 2705249. PMID 19390105.
- ^ Heilman, JM; Kemmann, E; Bonert, M; Chatterjee, A; Ragar, B; Beards, GM; Iberri, DJ; Harvey, M; Thomas, B; Stomp, W; Martone, MF; Lodge, DJ; Vondracek, A; de Wolff, JF; Liber, C; Grover, SC; Vickers, TJ; Meskó, B; Laurent, MR (Jan 31, 2011). "Wikipedia: a key tool for global public health promotion". Journal of Medical Internet Research. 13 (1): e14. doi:10.2196/jmir.1589. PMC 3221335. PMID 21282098.
- ^ Bill Tancer (2007-04-25). "Who's Really Participating in Web 2.0". Time Magazine. Archived from the original on 30 April 2007. Retrieved 2007-04-30.
- ^ Sumi, R.; Yasseri, T.; Rung, A.; Kornai, A.; Kertesz, J. (1 October 2011). "Edit Wars in Wikipedia": 724–727. doi:10.1109/PASSAT/SocialCom.2011.47 – via IEEE Xplore.
- ^ "Dynamics of Conflicts in Wikipedia". PLoS ONE. 7: e38869. arXiv:1202.3643. Bibcode:2012PLoSO...738869Y. doi:10.1371/journal.pone.0038869.
- ^ Török, J; Iñiguez, G; Yasseri, T; San Miguel, M; Kaski, K; Kertész, J. "Opinions, Conflicts, and Consensus: Modeling Social Dynamics in a Collaborative Environment". Physical Review Letters. 110: 088701. arXiv:1207.4914. Bibcode:2013PhRvL.110h8701T. doi:10.1103/PhysRevLett.110.088701. PMID 23473207.
- ^ Yasseri T.; Spoerri A.; Graham M.; Kertész J (2014). "The most controversial topics in Wikipedia: A multilingual and geographical analysis" (PDF). Scarecrow Press. Retrieved 4 February 2014.
- ^ a b Butler, Brian; Joyce, Elisabeth; Pike, Jacqueline (2008). "Don't look now, but we've created a bureaucracy". Proceeding of the twenty-sixth annual CHI conference on Human factors in computing systems - CHI '08: 1101. doi:10.1145/1357054.1357227.
- ^ Kriplean, Travis; Beschastnikh, Ivan; McDonald, David W.; Golder, Scott A. (2007). "Community, consensus, coercion, control". Proceedings of the 2007 international ACM conference on Conference on supporting group work - GROUP '07: 167. doi:10.1145/1316624.1316648.
- ^ Burke, Moira; Kraut, Robert (2008). "Taking up the mop". Proceeding of the twenty-sixth annual CHI conference extended abstracts on Human factors in computing systems - CHI '08: 3441. doi:10.1145/1358628.1358871.
- ^ Stumpf, S. A.; London, M. (1981). "Capturing rater policies in evaluating candidates for promotion". The Academy of Management Journal. 24 (4): 752–766. doi:10.2307/256174.
- ^ Forte, A., and Bruckman, A. Scaling consensus: Increasing decentralization in Wikipedia governance. Proc. HICSS 2008.
- ^ WP:3RR and WP:EW, policies which prevent repetitive reverting.
- ^ Metz, Cade. "Secret mailing list rocks Wikipedia". The Register.
- ^ Kittur, Aniket; Suh, Bongwon; Pendleton, Bryan A.; Chi, Ed H. (2007). "He says, she says: conflict and coordination in Wikipedia". Proceedings of the SIGCHI conference on Human factors in computing. Association for Computing Machinery: 453–462. doi:10.1145/1240624.1240698. ISBN 978-1-59593-593-9.
- ^ Viegas, Fernanda B.; Wattenberg, Martin; Kriss, Jesse; van Ham, Frank (2007). "Talk Before You Type: Coordination in Wikipedia". 40th Annual Hawaii International Conference on System Sciences. IEEE Xplore Digital Library: 575–582. doi:10.1109/HICSS.2007.511.
- ^ Derthick, K., P. Tsao, T. Kriplean, A. Borning, M. Zachry, and D. W. McDonald (2011). Collaborative Sensemaking during Admin Permission Granting in Wikipedia. In A.A. Ozok and P. Zaphiris (Eds.): Online Communities, HCII 2011, LNCS 6778, pp. 100–109.
- ^ Baeza-Yates, Ricardo; King, Irwin, eds. (2009). Weaving services and people on the World Wide Web. Springer. ISBN 9783642005695. LCCN 2009926100.
- ^ Yu, Liyang (2011). A Developer's Guide to the Semantic Web. Springer. doi:10.1007/978-3-642-15970-1. ISBN 9783642159695.
- ^ Márton Mestyán; Taha Yasseri; János Kertész (2013). "Early Prediction of Movie Box Office Success Based on Wikipedia Activity Big Data". PLoS ONE. 8: e71226. arXiv:1211.0970. Bibcode:2013PLoSO...871226M. doi:10.1371/journal.pone.0071226. PMC 3749192. PMID 23990938.
- ^ "Wikipedia buzz predicts blockbuster movies' takings weeks before release". The Guardian. Nov 8, 2012. Retrieved Sep 2, 2013.
- ^ "Using Wikipedia To Predict The Box Office Of A Movie". Forbes. Nov 9, 2012. Retrieved Sep 2, 2013.
- ^ Helen Susannah Moat; Chester Curme; Adam Avakian; Dror Y. Kenett; H. Eugene Stanley; Tobias Preis (2013). "Quantifying Wikipedia Usage Patterns Before Stock Market Moves". Scientific Reports. 3: 1801. Bibcode:2013NatSR...3E1801M. doi:10.1038/srep01801.
- ^ "Wikipedia's crystal ball". Financial Times. May 10, 2013. Retrieved August 10, 2013.
- ^ Kadhim Shubber (May 8, 2013). "Wikipedia page views could predict stock market changes". Wired.com. Retrieved August 10, 2013.
Further reading
- Adler, B.T.; de Alfaro, L. (2007). "A content-driven reputation system for the Wikipedia". Proceedings of the 16th international conference on World Wide Web. New York: ACM. pp. 261–270. doi:10.1145/1242572.1242608. ISBN 978-1-59593-654-7.
- Amichai-Hamburger, Y.; Lamdan, N.; Madiel, R.; Hayat, T. (2008). "Personality characteristics of Wikipedia members". Cyberpsychology & Behavior. 11 (6): 679–681. doi:10.1089/cpb.2007.0225.
- Blumenstock, J. E. (2008). "Size matters: word count as a measure of quality on Wikipedia". Proceedings of the 17th international conference on World Wide Web. New York: ACM. pp. 1095–1096. doi:10.1145/1367497.1367673. ISBN 978-1-60558-085-2.
- Bryant, S. L.; Forte, A.; Bruckman, A. (2005). "Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia". GROUP '05 Proceedings of the 2005 international ACM SIGGROUP conference on Supporting group work. New York: ACM. doi:10.1145/1099203.1099205. ISBN 1-59593-223-2.
- Farrell, H.; Schwartzberg, M. (2008). "Norms, Minorities, and Collective Choice Online". Ethics & International Affairs. 22 (4). Carnegie Council for Ethics in International Affairs. Archived from the original on 20 January 2009. Retrieved 2009-02-03.
- Hu, M.; Lim, E.-P.; Sun, A.; Lauw, H. W.; Vuong, B.-Q. (2007). "Measuring article quality in Wikipedia: models and evaluation". Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. New York: ACM. doi:10.1145/1321440.1321476. ISBN 978-1-59593-803-9.
- Jensen, R. (2012). "Military History on the Electronic Frontier: Wikipedia Fights the War of 1812" (PDF). Journal of Military History. 76 (4): 523–556.
- Kuznetsov, S. (2006). "Motivations of contributors to Wikipedia". ACM SIGCAS Computers and Society. 36 (2): 1. doi:10.1145/1215942.1215943.
- Luyt, B.; Aaron, T. C. H.; Thian, L. H.; Hong, C. K. (2008). "Improving Wikipedia's accuracy: Is edit age a solution?". Journal of the American Society for Information Science and Technology. 59 (2): 318–330. doi:10.1002/asi.20755.
- Medelyan, O.; Milne, D.; Legg, C.; Witten, I. H. (2008). "Mining Meaning from Wikipedia". arXiv:0809.4530 [cs.AI].
- Park, T. K. (2011). "The visibility of Wikipedia in scholarly publications". First Monday. 16 (8). doi:10.5210/fm.v16i8.3492.
- van Pinxteren, B. (2017). "African Languages in Wikipedia – A Glass Half Full or Half Empty?". Political Economy - Development: Comparative Regional Economies eJournal. 5 (12). SSRN 2939146.
- Shachaf, P. (2009). "The paradox of expertise: Is the Wikipedia reference desk as good as your library?". Journal of Documentation. 65 (6): 977–996. doi:10.1108/00220410910998951.
- Shachaf, P.; Hara, N. (2010). "Beyond vandalism: Wikipedia trolls". Journal of Information Science. 36 (3): 357–370. doi:10.1177/0165551510365390.
- Stein, K.; Hess, C. (2007). "Does it matter who contributes: a study on featured articles in the German Wikipedia". Proceedings of the eighteenth conference on Hypertext and hypermedia. New York: ACM. doi:10.1145/1286240.1286290. ISBN 978-1-59593-820-6.
- Suh, B.; Chi, E. H.; Kittur, A.; Pendleton, B. A. (2008). "Lifting the veil: improving accountability and social transparency in Wikipedia with wikidashboard". Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems. Association for Computing Machinery: 1037. doi:10.1145/1357054.1357214. ISBN 978-1-60558-011-1.
- Urdaneta, G.; Pierre, G.; van Steen, M. (2009). "Wikipedia Workload Analysis for Decentralized Hosting". Computer Networks. 53 (11): 1830–1845. doi:10.1016/j.comnet.2009.02.019.
- Vuong, B.-Q.; Lim, E.-P.; Sun, A.; Le, M.-T.; Lauw, H. W.; Chang, K. (2008). "On ranking controversies in Wikipedia: models and evaluation". Proceedings of the 2008 International Conference on Web Search and Data Mining. New York: ACM. doi:10.1145/1341531.1341556. ISBN 978-1-59593-927-2.
External links
- WikiPapers – a compilation of resources (conference papers, journal articles, theses, books, datasets and tools) focused on the research of wikis and Wikipedia