User:Filll/Controversial Article Project/Calculation

Ideas

episodic, quiescence, senescence, editor fatigue
centrality mean mode median others
T editor overturn temporal constant (rate of new editor appearance)
S sock puppets false positives, false negatives receiver operating characteristic (ROC) curve
R article dependent temporal scale circadian and weekly rhythms, event driven secular changes modulated by hardware limitations, edit conflicts
relative expected values of I as a function of time

Proposed metric

There are many types of editing on Wikipedia, and even many types of editing associated with articles. At the crudest level, article editing can be divided into two classes: article building and article defense.

Article building involves edits that create article content. These edits can appear on both the article mainspace page and the article talk page, but appear mostly on the mainspace page. Frequently some aspect of article construction takes place in sandboxes or offwiki, and these contributions, particularly offwiki construction, are harder to capture.

Article defense includes things like vandalism patrol, and debating and discussing the article content on the talk page and other pages, such as user pages. The number of reverts to a mainspace page can provide information about antivandalism efforts as well as capturing edit warring, which is also associated with the "debate and discussion" activity on a page. The number of reverts to talk pages can also provide information about vandalism activity, particularly reverts of talk page blanking. Some reverts of small sections of talk page edits are indications of responses to tendentious editing.

A controversial article will tend to be characterized by a greater ratio of article defense activities to article building activities than a less controversial article. Let C, the controversy ratio, be defined as the ratio of article defense activity to article building activity. Articles that are locked for any appreciable amount of time will have correspondingly larger values of C, since the editing restriction limits building activity while the defense, debate and discussion activity can continue unabated. Of course, if an article is locked for an extended period of time, interest in the discussion can wane and the debate will probably eventually dissipate.

To roughly approximate the controversy ratio C, let

D= number of talk page edits per month

B= number of main page edits per month

and define

C0=D/B, a rough first estimate of the controversy ratio C that indicates how controversial the article is.

If Rv is the number of reverts to the mainspace page, then a better estimate of the controversy ratio C is

C1=(D+Rv)/(B-Rv)

An even more accurate estimate of C would be

C2=(D+Rv-S)/(B-Rv+S)

where S is the number of constructive suggestions for article improvements placed on the talk page. One way to estimate S might be to sample talk pages of various kinds and estimate the ratios S/B and S/D, which can initially be assumed constant but could be allowed to differ for different categories of article (for example, assuming a dependence of S on C).
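The three estimates can be sketched in a few lines of Python. This is only an illustration of the formulas above: the function name and the sample counts are hypothetical, and the guard against a zero denominator is an added assumption, not part of the proposal.

```python
def controversy_estimates(talk_edits, main_edits, reverts=0, suggestions=0):
    """Return (C0, C1, C2) from monthly counts D, B, Rv and S as defined above."""
    d, b, rv, s = talk_edits, main_edits, reverts, suggestions
    # Assumption: refuse to divide when no building edits remain.
    if b <= 0 or b - rv <= 0 or b - rv + s <= 0:
        raise ValueError("no building activity left in the denominator")
    c0 = d / b                        # C0 = D/B
    c1 = (d + rv) / (b - rv)          # C1 = (D+Rv)/(B-Rv)
    c2 = (d + rv - s) / (b - rv + s)  # C2 = (D+Rv-S)/(B-Rv+S)
    return c0, c1, c2


# Hypothetical monthly counts, for illustration only.
c0, c1, c2 = controversy_estimates(talk_edits=120, main_edits=100,
                                   reverts=20, suggestions=10)
print(round(c0, 2), round(c1, 2), round(c2, 2))
```

With these made-up counts, moving reverts from the building side to the defense side raises the estimate, and moving constructive suggestions back lowers it again.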

Obviously, considerably more sophisticated estimates of C are possible. And clearly, better methods for distinguishing controversial from noncontroversial articles are possible. However, a few examples demonstrate that C0 is a reasonable statistic for distinguishing controversial from less contentious articles. For example, consider the following articles and the associated values of C0:

Trying to build a controversial article is more difficult than trying to build a noncontroversial article because many of the edits are changed or reverted away from their original intent. These edits have to be repeated, or their exact form debated. Therefore, article building efforts on noncontroversial articles are expected to be more effective and productive than article building efforts on controversial articles. The ratio C can be used to get some indication of how productive an edit to build an article is. Define the efficiency E of a building edit as

Efficiency E=100%/(C+1)

Ideally, for a noncontroversial article, C=0 and E=100%. If C=1, there is one defensive edit for every building edit, so about half the activity on the article is expected to be productive and the efficiency E is 50%. If C=2, the efficiency E is about 33.3%, and if C=3, the efficiency is only 25%.
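As a quick check of these figures, the efficiency formula can be evaluated directly. The function name is hypothetical.

```python
def efficiency_percent(c):
    """Efficiency E = 100% / (C + 1) of a building edit, in percent."""
    if c < 0:
        raise ValueError("the controversy ratio C cannot be negative")
    return 100.0 / (c + 1.0)


for c in (0, 1, 2, 3):
    print(f"C={c}: E={efficiency_percent(c):.1f}%")
# prints E = 100.0%, 50.0%, 33.3% and 25.0% for C = 0, 1, 2 and 3
```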

Of course, the controversy ratio C is expected to be temporally dependent, and dependent on article quality as well. The article rating may be taken as a proxy for article quality. It is conjectured that the rate of administrative intervention on controversial articles is higher than on noncontroversial articles. Therefore, some measure of administrative intervention, including page locks, article content RfCs, content RfCs for the most active article participants, Arbcomm proceedings for article contributors and articles, mediation proceedings, Administrator's noticeboard threads and other noticeboard discussions should be correlated with the controversy ratio C.

A given editor will make both building edits to an article, and defense edits to an article. Their defense edits are valuable as defense, but the value of their building edits is eroded by the amount of controversy associated with the article. A reasonable ansatz for an expression that captures this is to multiply the number of mainspace edits of an editor by the factor 1/(1+C). Reducing the number of mainspace edits by the factor 1/(1+C) attempts to capture the devaluation of mainspace edits in a controversial article. If the article is noncontroversial, C=0 and all mainspace edits are effective, so that productivity is high. As C increases, fewer of the mainspace edits are effective and productivity is lowered accordingly.

Editors put most of their time and energy into a few articles. There are of course articles that an editor adds one or two edits to, fixing a spelling mistake or a punctuation error, but these do not determine much about the orientation of the editor. The articles on which the editor spends most of his or her time and compiles the largest number of edits are more useful for determining editor orientation. A simple first step towards quantifying editor activity is to examine the average value of the controversy ratios of the articles a given editor has edited most often.

Editor article activity can take place both on the mainspace page and on the talk page, and in general the articles where an editor is most active in mainspace differ from those where the editor is most active on the talk page. The average controversy ratios for the articles with the greatest mainspace activity and the greatest talk page activity can show which articles the editor chooses to devote their resources to.

The fact that an editor has put more effort into a given article does not by itself capture the intensity of that effort. It is important to provide some measure of effort intensity to distinguish different editors with different styles. An expression that tries to describe the intensity of editor building activity, corrected for productivity, is

Ib=b*f/(1+C0)

where b is the monthly average mainspace edits to an article, and f is the fraction of mainspace edits due to the given editor. An expression that is directed towards capturing the intensity of editor defense activity on an article is

Id= d*f

where d is the monthly average talk page edits and f is the fraction of talk page edits that the editor is responsible for.
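A minimal sketch of both intensity measures follows. The function names are hypothetical. The sample inputs are the Frère Jacques figures listed for User:Filll under Editor examples, read on the assumption that each entry gives monthly edits, editor rank and editor edits/total edits, first for the mainspace page and then, in parentheses, for the talk page.

```python
def building_intensity(b, f, c0):
    """Ib = b*f/(1+C0): the editor's monthly mainspace edits, discounted by controversy."""
    return b * f / (1.0 + c0)


def defense_intensity(d, f):
    """Id = d*f: the editor's monthly talk page edits."""
    return d * f


# Frère Jacques, User:Filll: 19 mainspace edits/month, 403 of 699 mainspace edits,
# 6.3 talk edits/month, 96 of 181 talk edits, C0 = 0.26.
ib = building_intensity(b=19, f=403 / 699, c0=0.26)
id_ = defense_intensity(d=6.3, f=96 / 181)
print(round(ib, 2))   # 8.69 corrected edits per month
print(round(id_, 2))
```

Note that f is a different fraction in the two expressions: the editor's share of mainspace edits in Ib and the editor's share of talk page edits in Id.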

A vector valued function can be created for both these, with separate components for each article the editor contributes to. A potentially useful summary statistic is produced by averaging the values of Ib and Id for the editor over the articles they have devoted the most effort to. For example, some arbitrary limits for the number of mainspace pages and talk pages to consider can be imposed and an average of Ib and Id for an editor formed.

Editor examples

User:Filll has the most edits on the mainspace pages of Frère Jacques, Introduction to evolution, Translations of Frère Jacques, Level of support for evolution, and Hot spring (ordered by number of edits), which have estimated controversy ratios C0 of 0.26, 0.76, 0.18, 1.39 and 0.06 respectively. The sum of these 5 estimated controversy ratios is 2.65. Similarly, User:Filll has the most edits on the talk pages of Black people, Intelligent design, Evolution, Homeopathy, and Expelled: No Intelligence Allowed, again ordered by decreasing number of edits. The estimated controversy ratios of these articles are 1.02, 1.60, 1.19, 1.81 and 1.72 respectively. The sum of these estimated controversy ratios is 7.34.

The suggested measure of building intensity, corrected for productivity for the five articles which User:Filll has contributed most mainspace edits to is 8.69, 12.67, 15.67, 7.94, and 3.51 corrected edits per month, given in order of decreasing number of edits.

Filll
- - Top 5 Mainspace
- Frère Jacques 19, 1, 403/699 (6.3, 1, 96/181)
  - 0.26
- Introduction to evolution 135.6, 2, 372/2262 (104.1, 2, 274/1709)
  - 0.76
- Translations of Frère Jacques 29.9, 1, 355/574 (6, 1, 56/104)
  - 0.18
- Level of support for evolution 47.7, 1, 296/735 (65.2, 1, 298/1025)
  - 1.39
- Hot spring 9.2, 1, 259/640 (0.9, 1, 17/39)
  - 0.06

- - Top 5 Talk page
- Black people 84.2, 28, 36/6308 (99.7, 1, 635/6416)
  - 1.02
- Intelligent design 141.1, 30, 45/10693 (227.9, 5, 626/17073)
  - 1.60
- Evolution 147.1, 35, 37/11116 (177.2, 3, 565/13270)
  - 1.19
- Homeopathy 87.2, 4, 219/6548 (157.5, 4, 523/11823)
  - 1.81
- Expelled: No Intelligence Allowed 239.1, 1, 224/1885 (468.8, 1, 462/3233)
  - 1.72

Articles to look at

Type C

Homeopathy 6548, 1246, 87.2 (11823, 530, 157.5)
- 1.81
Depleted Uranium 2080, 643, 27.5 (1483, 216, 22.9)
- 0.71

9/11 conspiracy theories 9009, 2392, 210.8 (5501, 624, 132.1)
- 0.61

Evolution 11116, 3128, 147.1 (13270, 1265, 177.2)
- 1.19

Intelligent Design 10693, 2542, 141.1 (17076, 746, 228.0)
- 1.60

Black people 6312, 1526, 84.3 (6416, 473, 99.7)
- 1.02

What the Bleep Do We Know!? 1953, 561, 43.1 (3987, 219, 87.5)
- 2.04

Electronic voice phenomenon 2949, 442, 63.8 (4207, 137, 91.6)
- 1.43

Type NC

Cardiology 369 edits, 200 editors, 4.5/month, (21, 12, 0.4)
- 0.06

Sunflower 1150, 647, 16.2 (69, 47, 1.1)
- 0.06

Plastic 1990, 1077, 26.6 (98, 61, 1.9)
- 0.05

Saint Pierre and Miquelon 954, 416, 12.8 (140, 44, 2.6)
- 0.15

Isle of Wight 1642, 598, 21.9 (236, 65, 4.3)
- 0.14

Cattle 4661, 2282, 62.2 (526, 284, 7.7)
- 0.11
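The C0 values in the two lists above can be recomputed from the raw counts with a short script. It assumes, following the labelled Cardiology entry, that the first number in each entry is total mainspace edits and the first number in parentheses is total talk page edits. Using totals instead of monthly rates gives the same ratio only if both pages cover the same number of months, which is also an assumption here.

```python
# (article, total mainspace edits, total talk page edits), copied from the lists above.
TYPE_C = [
    ("Homeopathy", 6548, 11823),
    ("Depleted Uranium", 2080, 1483),
    ("9/11 conspiracy theories", 9009, 5501),
    ("Evolution", 11116, 13270),
    ("Intelligent Design", 10693, 17076),
    ("Black people", 6312, 6416),
    ("What the Bleep Do We Know!?", 1953, 3987),
    ("Electronic voice phenomenon", 2949, 4207),
]
TYPE_NC = [
    ("Cardiology", 369, 21),
    ("Sunflower", 1150, 69),
    ("Plastic", 1990, 98),
    ("Saint Pierre and Miquelon", 954, 140),
    ("Isle of Wight", 1642, 236),
    ("Cattle", 4661, 526),
]


def c0_from_totals(main_total, talk_total):
    """C0 = D/B, estimated here from total talk and mainspace edit counts."""
    return talk_total / main_total


c0_c = {name: c0_from_totals(m, t) for name, m, t in TYPE_C}
c0_nc = {name: c0_from_totals(m, t) for name, m, t in TYPE_NC}

for label, table in (("Type C", c0_c), ("Type NC", c0_nc)):
    for name, value in sorted(table.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{label:8s}{name:32s}{value:.2f}")
```

On these counts every Type C article has a larger C0 than every Type NC article, which is the separation the proposed metric is meant to show.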

Data to collect

Total edits
Total months
Highest 6 months edits, 12 months

Users to study

Type C

Durova
Raul654
JzG
MONGO
Orangemarlin
Filll
- - Top 5 Mainspace
- Frère Jacques 19, 1, 403/699 (6.3, 1, 96/181)
- Introduction to evolution 135.6, 2, 372/2262 (104.1, 2, 274/1709)
- Translations of Frère Jacques 29.9, 1 355/574 (6, 1, 56/104)
- Level of support for evolution 47.7, 1, 296/735 (65.2, 1, 298/1025)
- Hot spring 9.2, 1, 259/640 (0.9, 1, 17/39)

- - Top 5 Talk page
- Black people 84.2, 28, 36/6308 (99.7, 1, 635/6416)
- Intelligent design 141.1, 30, 45/10693 (227.9, 5, 626/17073 )
- Evolution 147.1, 35, 37/11116 (177.2, 3, 565/13270)
- Homeopathy 87.2, 4, 219/6548 ( 157.5, 4, 523/11823 )
- Expelled: No Intelligence Allowed 239.1, 1, 224/1885 (468.8, 1, 462/3233)

Dave souza
Hrafn

Type NC

DGG
- Open access, 14 per month, ranked number 1, 66/747 (2 per month, ranked number 1, 19/71)
- E-book, 19.3 per month, ranked number 1, 61/1199 (2.5 per month, ranked number 1, 19/54)
- Printing press 21.7 per month, ranked number 2, 51/1627 (2.3 per month, ranked number 1, 27/184)
- Johannes Gutenberg 27.6 per month, ranked number 3, 46/2063 (4.2 ranked 2, 57/316)
- Movable type 5.3 per month, ranked number 3, 30/393 (6.4 ranked 4, 19/111)
- Phage therapy 6.2 per month, ranked number 2, 28/292 (1.8 per month, ranked number 1, 13/44)

- 42 Johannes Gutenberg 27.6 per month, ranked number 3, 46/2063 (4.2 ranked 2, 57/316)
- 27 Printing press 21.7 per month, ranked number 2, 51/1627 (2.3 per month, ranked number 1, 27/184)
- 23 Joseph Schlessinger 89.2, 4, 20/414 (29.5, 4, 23/173)
- Movable type 5.3 per month, ranked number 3, 30/393 (6.4 ranked 4, 19/111)

- E-book 19.3 per month, ranked number 1, 61/1199 (2.5 per month, ranked number 1, 19/54)

GTBacchus

- 115 Cocaine
- 91 The Beatles
- 86 Hipster (1940s subculture)
- 63 Lysergic acid diethylamide
- 58 Christmas controversy

- 143 Abortion
- 63 Criticism of Wikipedia
- 52 Abortion/First paragraph
- 46 Christmas controversy
- 40 Iraq War

Tim Vickers

LaraLove
WillowW
- 682 Encyclopædia Britannica
- 511 Equipartition theorem
- 417 Action potential
- 322 Photon
- 320 X-ray crystallography

- 135 Photon
- 69 Introduction to general relativity
- 64 Harold Pinter
- 62 Action potential
- 56 Encyclopædia Britannica

Data to collect

Most controversial articles edited
rank out of editors on these controversial articles

Single purpose accounts

A common phenomenon observed on Wikipedia is the existence of "Single purpose accounts". Single purpose accounts, which edit only one or two kinds of article on one or two subjects (as can be determined by estimating distances in the category graph), can be productive if they are involved mainly in building activities. Single purpose accounts that are significantly involved in "debate and discussion", including reverting activity, on controversial articles are probably more disruptive than productive. Using these metrics it should be possible to distinguish "Single purpose accounts" from other accounts automatically and readily, and to distinguish productive from disruptive single purpose accounts.

Temporal dependence

Clearly, both article editing and editor activity exhibit episodic behavior and other temporal dependence. For example, there are obvious and expected circadian and hebdomadal patterns as well as seasonal fluctuations in editing. Holiday periods and early morning hours for the largest population of contributors are likely to exhibit different editing activity than other times. Some articles have distinct episodic editing patterns, like Translations of Frère Jacques, which experienced a burst of editing activity in its second month, and has exhibited relative quiescence since then for well over a year. Other articles, like Expelled: No Intelligence Allowed experience a secular increase in editing activity associated with some real world event, like the release of the associated film. Still other articles like Introduction to evolution experience sharp increases in editing activity associated with article creation and the efforts associated with ratings improvement (the two sharp jumps in editing in September of 2007 and December 2007/January 2008 were associated with a GA and an FA application, respectively). Of course, locking an article can also create strong temporal dependence in article editing data.

In the case of strongly episodic data, other measures of centrality such as the median and mode are probably more appropriate than the arithmetic mean. Depending on the distribution involved, even more exotic measures of centrality such as the harmonic mean and the geometric mean might be appropriate, or nonparametric resampling methods such as the jackknife and bootstrap might be used alongside them.
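To illustrate how much the choice of centrality measure matters for episodic data, the sketch below compares several measures from Python's standard statistics module on a made-up series of monthly edit counts with a single burst. The numbers are hypothetical and are not taken from any article.

```python
import statistics

# Hypothetical monthly edit counts: a burst in the third month, then quiescence.
monthly_edits = [2, 3, 250, 4, 3, 2, 3, 5, 3, 2, 3, 4]

summary = {
    "arithmetic mean": statistics.mean(monthly_edits),
    "median": statistics.median(monthly_edits),
    "mode": statistics.mode(monthly_edits),
    "geometric mean": statistics.geometric_mean(monthly_edits),
    "harmonic mean": statistics.harmonic_mean(monthly_edits),
}

for name, value in summary.items():
    print(f"{name:16s}{value:8.2f}")
```

The single burst dominates the arithmetic mean, while the median and mode stay at the typical quiet-month value of 3 and the geometric and harmonic means fall in between.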

More fine-grained descriptions of editing activity can be useful for investigating other issues. For example, time series of edits with a daily sampling period, or even more frequent, would be valuable to see the effects on the article summary statistics such as the controversy ratio C, or building and defense rates b and d. These fine-grained statistics can be created using a moving average or other low pass filtered version of the raw data. Removal of harmonics to prewhiten the data can reduce spurious correlation bias.
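A minimal sketch of the moving-average idea follows, assuming daily edit counts are already available as plain lists. The data are hypothetical and exactly periodic with a weekly rhythm, so a 7-day window averages the rhythm out completely. Real data would only be smoothed, not flattened.

```python
def moving_average(series, window):
    """Simple moving average (a crude low pass filter) over a list of counts."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    out = [running / window]
    for i in range(window, len(series)):
        running += series[i] - series[i - window]  # slide the window by one day
        out.append(running / window)
    return out


# Hypothetical daily counts over four weeks, with fewer edits at weekends.
daily_main = [10, 12, 9, 11, 10, 4, 3] * 4
daily_talk = [14, 15, 13, 16, 12, 6, 5] * 4

smooth_b = moving_average(daily_main, window=7)
smooth_d = moving_average(daily_talk, window=7)

# Daily-sampled estimate of C0 from the smoothed building and defense rates.
smooth_c0 = [d / b for d, b in zip(smooth_d, smooth_b)]
print(round(smooth_c0[0], 2), round(smooth_c0[-1], 2))
```

Taking the ratio of the smoothed rates, rather than smoothing the daily ratio, avoids division by zero on days with no mainspace edits.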

It has been claimed that some editors are sufficiently skilled at conflict resolution and consensus building that their talk page contributions are far more efficient than those of typical users. Hopefully mediators would also exhibit this purported skill. It could be conjectured that various administrative actions such as Wikipedia:Article probation or sanctioning of disruptive editors or achievement of GA or FA status can all affect the controversy ratio C. These can all be investigated using more frequently sampled editing data.

It is to be expected that the removal of disruptive editors, or a short term contribution from an editor with special consensus building skills, might decrease the controversy ratio C, temporally correlated with event onset, and that C might then experience a relaxation or reversion to the mean with a characteristic time constant. Hypothetically, an "editor overturn time constant" might depend in some simple way on the rate at which new editors appear on a page. If new editors appear every day or two, something done to calm the waters a few days previously might already be forgotten and archived, and the effect of the intervention might not last long.
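One simple way to picture such a relaxation is an exponential return to the long-run value. The exponential form, the function name and the numbers below are all assumptions made for illustration. The text above only posits that some characteristic time constant exists.

```python
import math


def relaxed_c(t_days, c_baseline, c_after_event, tau_days):
    """Assumed model: C(t) = C_base + (C_event - C_base) * exp(-t / tau)."""
    if tau_days <= 0:
        raise ValueError("the time constant must be positive")
    return c_baseline + (c_after_event - c_baseline) * math.exp(-t_days / tau_days)


# Hypothetical case: an intervention lowers C from 1.8 to 1.0 and the
# effect decays with a time constant of 5 days.
for t in (0, 5, 10, 30):
    print(t, round(relaxed_c(t, c_baseline=1.8, c_after_event=1.0, tau_days=5), 3))
```

Under this model a page with rapid editor turnover would correspond to a small tau, so the benefit of an intervention fades within days.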

The relevant time scale is probably dependent on article editing activity. Articles and the associated talk pages receiving a low number of edits evolve very slowly, with a new edit every few weeks or months. More aggressively edited articles can exhibit 100 kilobytes or more of new talk page edits per day. The only limits in some cases are the limitations of the hardware, leading to edit conflicts.

Editor overturn time constant

Clearly the rate of new editor appearance at an article is biased upwards by the presence of sock puppets and meat puppets. Therefore the raw data should be corrected for this effect. We use some techniques for distinguishing sock puppets, but of course any such algorithm has a receiver operating characteristic (ROC) curve and exhibits type I and type II errors, that is, false positives and false negatives. To correct for this effect, one needs to estimate something like the Blackstone ratio, so that the probability of a given new editor being a sock puppet can be estimated.
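One possible form of this correction is sketched below. It is not the detection technique referred to above. It is a standard prevalence correction (sometimes called the Rogan-Gladen estimator) combined with Bayes' rule, applied to a hypothetical detector whose true positive and false positive rates are assumed known from its ROC operating point. All names and numbers are illustrative.

```python
def sock_fraction(flagged_fraction, tpr, fpr):
    """Estimate the true fraction of sock puppets among new accounts.

    Inverts q = p*TPR + (1 - p)*FPR for p, clamped to the range [0, 1].
    """
    if tpr <= fpr:
        raise ValueError("the detector must do better than chance (TPR > FPR)")
    p = (flagged_fraction - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))


def prob_sock_given_flag(p, tpr, fpr):
    """Bayes' rule: probability that a flagged new account really is a sock puppet."""
    flagged = p * tpr + (1.0 - p) * fpr
    return p * tpr / flagged if flagged > 0 else 0.0


# Hypothetical detector and page: 80% true positives, 5% false positives,
# 20% of new accounts flagged, 10 new accounts per week in the raw data.
p = sock_fraction(flagged_fraction=0.20, tpr=0.80, fpr=0.05)
posterior = prob_sock_given_flag(p, tpr=0.80, fpr=0.05)
corrected_rate = 10 * (1.0 - p)  # new accounts per week that are not sock puppets
print(round(p, 3), round(posterior, 3), round(corrected_rate, 2))
```

The corrected rate, rather than the raw rate of new account appearance, would then feed into any estimate of the editor overturn time constant.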