Wikipedia:About Valid Routine Calculations

From Wikipedia, the free encyclopedia

The policy WP:CALC expresses, in general terms, when an editor may add calculations to an article that are not in the sources but are an obvious and acceptable interpretation of the source data:

Routine calculations do not count as original research. Basic arithmetic, such as adding numbers, converting units, or calculating a person's age, is allowed provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources.

This essay offers both a detailed review of "routine calculations" at Wikipedia and a natural extension of this policy:

The recursive use of routine calculations, such as summation, products of sequences, or the calculation of averages, also does not count as original research when the article's reader can interpret it as a summary of numerical data, i.e. when it is used for well-known (and consensual) forms of "numerical synthesis". In this context, the synthesis of numerical data is not original research by synthesis.

Working definitions and examples

This section gives more formal and detailed definitions, to support Wikipedians who are discussing the policy WP:CALC and similar policies.

Routine calculations

A routine calculation consists of

Each of these formulas must be applied in a valid context of units, dimensions, precision, etc., and is valid as a "routine calculation" (WP:CALC) when there is (implicit or discussed) consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources.

Rounding and precision

With automatic conversion it is easy and safe to convert units, for example "5300 meters" to feet with the template Convert/lengthcalc: "{{Convert/lengthcalc|5300}}" results in "{{Convert/lengthcalc|5300}}". Unfortunately, the work does not stop there. Quality work takes context into account and interprets precision: "5300 meters" often means something like "5300±100 meters", so the correct conversion is "5300 m (17,400 ft)".
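The precision argument above can be sketched in Python. This is only an illustration, not the algorithm used by Template:Convert: the function names and the rounding rule (round the result to the largest power of ten that does not exceed the converted uncertainty) are assumptions made for this example.

```python
import math

METERS_PER_FOOT = 0.3048  # exact by the international definition of the foot


def meters_to_feet(meters: float, uncertainty_m: float) -> float:
    """Convert meters to feet, rounded to match the source's implied precision.

    Assumed rule (for illustration only): round the result to the largest
    power of ten that does not exceed the uncertainty expressed in feet.
    """
    feet = meters / METERS_PER_FOOT
    uncertainty_ft = uncertainty_m / METERS_PER_FOOT
    digits = -math.floor(math.log10(uncertainty_ft))
    return round(feet, digits)


def format_appended(meters: float, uncertainty_m: float) -> str:
    """Keep the source value and append the converted value."""
    feet = meters_to_feet(meters, uncertainty_m)
    return f"{meters:.0f} m ({feet:,.0f} ft)"


print(f"{5300 / METERS_PER_FOOT:,.2f} ft")  # unrounded: 17,388.45 ft
print(format_appended(5300, 100))           # 5300 m (17,400 ft)
```

Reading "5300 meters" as 5300±100 m gives an uncertainty of about 328 ft, so reporting the result to the nearest 100 ft avoids claiming more precision than the source has.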

Numerical treatment and representation

Illustration: bar chart with the values foo 30%, bar 40%, baz 20%, bla 8%, bla 1/50. See Template:Bar_box.

When sources contain complex or technical data and editors need to reproduce that data on Wikipedia for general readers, some numerical treatment may be applied. Examples:

  • Use of error propagation rules when applying a "routine calculation" to quantities with error values (e.g. multiplying 4.26 ± 0.02 by two).
  • Recursive use of routine calculations over a list of numbers.
  • Graphic representation of data.
  • ...
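As a sketch of the first item in the list, the snippet below applies two standard error propagation rules: multiplying by an exact constant scales the absolute error, and adding independent quantities combines their errors in quadrature. The `Measurement` class and its method names are hypothetical and exist only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A value with an absolute error (hypothetical helper for illustration)."""
    value: float
    error: float

    def scaled(self, factor: float) -> "Measurement":
        # Multiplying by an exact constant scales the absolute error by |factor|.
        return Measurement(self.value * factor, self.error * abs(factor))

    def plus(self, other: "Measurement") -> "Measurement":
        # Assumes the two errors are independent: they combine in quadrature.
        return Measurement(self.value + other.value,
                           math.hypot(self.error, other.error))


doubled = Measurement(4.26, 0.02).scaled(2)
print(f"{doubled.value:.2f} ± {doubled.error:.2f}")  # 8.52 ± 0.04
```

Doubling a quantity is not the same as adding two independent measurements of it. `scaled(2)` gives ± 0.04, while `plus` applied to two independent measurements of 4.26 ± 0.02 would give roughly ± 0.03, so the editor has to know which case the source describes.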

Summary of numerical data

The repeated (recursive) use of routine operations, such as summation, "products of sequences", or the calculation of averages, is valid when used for well-known (and consensual) forms of "numerical synthesis" that the article's reader can interpret as summaries of numerical data. Example:

| quant. A | quant. B | Perc. of A | Diff. |
|---|---|---|---|
| 20 | 123 | 16.3% | |
| 40 | 234 | 17.1% | 0.8% |
| 55 | 300 | 18.3% | 1.2% |
| Total: 115 | Total: 657 | Average: 17.2% | Average: 1.0% |

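Legend (the original background colors are not reproduced here): the "quant. A" and "quant. B" values are source data; the "Perc. of A" and "Diff." values are calculated by an editor; the "Total" and "Average" values are summarized by an editor.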

The table above illustrates encyclopedic content produced from source data and simple calculations (see WP:NOTOR, "Simple calculations"). It "translates and synthesizes" the source data with accuracy and a neutral point of view, preserving "the truth" of the source. A "new truth", by contrast, can be produced by some statistical methods, such as when an average is interpreted as an expected value.
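The calculated and summarized cells of the example table can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are chosen for this illustration, and "Perc. of A" is taken to mean quant. A as a percentage of quant. B, which is what the table's figures imply.

```python
# Source data: (quant. A, quant. B) pairs copied from the example table.
source_rows = [(20, 123), (40, 234), (55, 300)]

# Calculated by an editor: A as a percentage of B, row by row.
percents = [100 * a / b for a, b in source_rows]
# Calculated by an editor: change in that percentage from one row to the next.
diffs = [later - earlier for earlier, later in zip(percents, percents[1:])]

# Summarized by an editor: totals and unweighted averages.
total_a = sum(a for a, _ in source_rows)
total_b = sum(b for _, b in source_rows)
average_percent = sum(percents) / len(percents)
average_diff = sum(diffs) / len(diffs)

# The first row has no previous row, so its difference cell stays empty.
for (a, b), percent, diff in zip(source_rows, percents, [None] + diffs):
    diff_text = "" if diff is None else f"{diff:.1f}%"
    print(f"{a:>5} {b:>5} {percent:>6.1f}% {diff_text:>6}")
print(f"Total A: {total_a}, total B: {total_b}")
print(f"Average percentage: {average_percent:.1f}%, "
      f"average difference: {average_diff:.1f}%")
```

The "Average: 17.2%" cell is the unweighted mean of the three row percentages. Dividing the totals (115 / 657) gives about 17.5% instead, so even a simple summary involves a choice that editors should be able to state and agree on.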

Recommendations

Guidance kernel. This section is not a Wikipedia policy or guideline, though it may be consulted for assistance and advice.

Routine calculations

General rules: when in doubt between two or more alternative calculations, prefer established Wikipedia practice to non-traditional usage, a template-made calculation to a hand-made one, a standard method to a non-standard one, and a simple method to a complex one. The table below shows contexts and methods for routine calculations. Each method is an option available to the editor for performing a calculation, or a strategy for expressing its result.

Usual in unit conversion and percentages. Source fragments: "were 120, out of a total of 200"; "5300 meters". These are examples of a fragment of source text and a fragment of source data that each method converts to a different result.

| Method | Preference | Description | Example/Result | Comments |
|---|---|---|---|---|
| Appended | Preferred | The numerical data from the source is copied directly, with the conversion added next to it. | "were 120 of 200 (60%)"; "5300 m (17,400 ft)" | May be too wordy in certain circumstances. |
| Loss-less replacement | Alternate | The original data is replaced by the converted data, but without loss of information. | "60% of 200"; "17,400 ft" | Care must be taken to ensure both the correct number of significant digits and that the translation does not misrepresent the source. |
| Lossy replacement | Discouraged | The original data is replaced by the converted data, but with loss of information. | "60%"; "17,400 ft" | Used only in special cases, when the numerical data is less important (and/or less precise) and a compact, less wordy expression is preferred. |

Usual in multiple sources with different values. Source-1: 1500; source-2: 1600; source-3: 2000. This is an example of three sources giving three different values, which each method describes differently.

| Method | Preference | Description | Example/Result | Comments |
|---|---|---|---|---|
| Record all values | Preferred | All of the values are recorded with their references. | "various experiments have measured the value as 1500[1], 1600[2], or 2000[3]" | Generally not recommended for a large number of sources or when sources are unequal in merit. |
| Record range | Alternate | Only the maximum and minimum values are recorded, and all of the sources are cited. | "experimental values range from 1500 to 2000 [1-3]" | May be preferred for cases with large numbers of sources and where the range is representative. May conflict with the original research policy if unequal weighting or outliers could skew the data. |
| Record average with reliability range | Discouraged | The average is recorded along with a standard deviation or standard deviation of the mean. | "experimental values average at 1700±300 [1-3]" | In most if not all cases this is original research, since the relative weight of each value is needed. The text must also state explicitly whether the range is a standard deviation or a standard deviation of the mean. Difficult to keep current if another source is added. |
| Record average without reliability range | Discouraged | The unweighted average of the source data is recorded. | "experimental values average at ~1700 [1][2][3]" | May be slightly more honest than the average with reliability range, in that no claim is made about reliability, but it still requires some knowledge of the relative weights of the sources. Difficult to keep current when sources are added, removed, found to be redundant, etc. |
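The methods for multiple sources can be compared side by side in a short Python sketch. The function names and output wording are assumptions modeled on the example cells above; the sketch uses the sample standard deviation from the standard library's `statistics` module, which is only one of the possible readings of "reliability range".

```python
import math
import statistics

# Values reported by source-1, source-2 and source-3 in the example above.
source_values = [1500, 1600, 2000]


def record_all(values):
    """Preferred: list every value next to its reference number."""
    cited = [f"{value}[{ref}]" for ref, value in enumerate(values, start=1)]
    return ", ".join(cited[:-1]) + f", or {cited[-1]}"


def record_range(values):
    """Alternate: give only the minimum and maximum, citing all sources."""
    return f"from {min(values)} to {max(values)} [1-{len(values)}]"


def record_average(values):
    """Discouraged: unweighted mean with the sample standard deviation."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)  # sample standard deviation
    return f"{mean:.0f}±{round(spread, -2):.0f}"  # spread to the nearest 100


print(record_all(source_values))      # 1500[1], 1600[2], or 2000[3]
print(record_range(source_values))    # from 1500 to 2000 [1-3]
print(record_average(source_values))  # 1700±300

# The standard deviation of the mean is a different, smaller number,
# which is why the text must say which one is being reported.
stdev_of_mean = statistics.stdev(source_values) / math.sqrt(len(source_values))
print(f"{stdev_of_mean:.0f}")         # 153
```

The last two lines show why the table asks editors to be explicit: the same three sources give a spread of about 265 (rounded to 300) as a standard deviation but about 153 as a standard deviation of the mean.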

Customary justifications for "valid routine calculations". This is not a full systematic list, but it can be used as a reference and as a basis for inductive reasoning:

  • "Real time" calculations:
    • Age calculation. Use subtraction of dates, currentDate-bornDate. Example: the source says "born in 2001" and Wikipedia writes the age today, "12 years old" (a template calculates currentDate-bornDate=2013-2001=12). See Template:Age.
    • (... any other? ...)
  • ...
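The age calculation above can be sketched in Python as follows. This is an illustration only and not the implementation of Template:Age; the function name `age_in_years` and the sample dates are assumptions for this example.

```python
from datetime import date


def age_in_years(born: date, today: date) -> int:
    """Whole years elapsed between a birth date and a given day."""
    years = today.year - born.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (born.month, born.day):
        years -= 1
    return years


# Year-only subtraction, as in the example: 2013 - 2001.
print(2013 - 2001)                                        # 12

# With full dates the result depends on whether the birthday has passed.
print(age_in_years(date(2001, 5, 20), date(2013, 3, 1)))  # 11
print(age_in_years(date(2001, 5, 20), date(2013, 9, 1)))  # 12
```

When the source gives only the year of birth, the year-only subtraction can be off by one for part of the year, which is one reason a calculation like this still needs editors to agree that it is a correct reflection of the source.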

Rounding, precision and other treatment procedures

...

Summary of numerical data

...

See also

Policies:

  • "original research by synthesis" (WP:SYNTH)
  • ...

Templates:

Essays:

Discussions: [1], [2], [3], [4], [5], [6], [7], [8]