Talk:History of statistics

From Wikipedia, the free encyclopedia
          This article is of interest to the following WikiProjects:
WikiProject Statistics (Rated C-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as Top-importance on the importance scale.
WikiProject Mathematics (Rated C-class, High-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: C-Class, High-importance
Field: Probability and statistics (historical)
WikiProject History of Science (Rated C-class, Mid-importance)
This article is part of the History of Science WikiProject, an attempt to improve and organize the history of science content on Wikipedia. If you would like to participate, you can edit the article attached to this page, or visit the project page, where you can join the project and/or contribute to the discussion. You can also help with the History of Science Collaboration of the Month.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.

Three greatest statisticians, according to the ASA

Consider the proposition that the ASA had named Deming, Fisher, and Rao as the three greatest statisticians of all time, which has for some time been linked to an ASA webpage about Rao. Editor Melcombe gently asked for a reference, and politely noted that the cited source made no such statement. I confirmed his judgement that the citation made no such statement, and less gently removed it. Now, the claim has been reinserted with a link to the same page. Would the other editor(s) please provide a proper citation? Thanks, Kiefer.Wolfowitz (talk) 19:53, 23 February 2010 (UTC)

Slightly biased toward Bayesian Statistics?

Given the greater influence of frequentist null hypothesis significance testing in fields such as psychology and neuroscience, I find it strange that the history of Bayesian statistics takes up a greater part of the article. Although I fully acknowledge the importance of Bayesian methods, I think that more emphasis could be placed on the development of, say, the t-distribution and some of the first uses of the mean, perhaps even some of the later history including the works of Snedecor, Tukey, and Wilcox. However, having entered statistics from the applied statistics perspective, I realize mathematical statisticians might differ in this case. Ostracon (talk) 19:47, 29 April 2012 (UTC)

What is here was extracted from the main Bayesian inference article so as to reduce the overwhelmingness of "history" there (and avoid having it in other articles as well), similarly for "design of experiments". Go ahead and expand the article for better balance if you can. I don't think there are already any portions of other articles that can be conveniently copied in. Melcombe (talk) 23:17, 30 April 2012 (UTC)

Move portions proposal

There is some rather out-of-place historical background on "design of experiments" in Multifactor design of experiments software that would usefully expand the "design of experiments" section in history of statistics. Melcombe (talk) 22:07, 5 May 2012 (UTC)

Section moved, but could still possibly do with reduction. Melcombe (talk) 22:11, 1 June 2012 (UTC)

New content proposals

My impression is that the section omits everything modern-and-important; it makes me think of reading a History of PCs that ends after it describes MS-DOS.

I have some general suggestions that I don't know how to organize, and I have a particular proposal for a new section. In general: It seems to me that something should be said about the broad applications of statistics other than those for Experimental Designs. There have long been observational studies, which were given a better perspective by the Bradford Hill rules (1964 or so). The US government has supported epidemiological work since the predecessor of NIH was created in the 1870s. "Observational" statistics were the mainstay of broad incorporation of statistics - both descriptive and inferential - into all the social sciences, including (even) history. Economics is largely "statistics" these days. Should not these areas of application be mentioned?

Also, nothing is said about developments of statistical queuing theory and related network theory over 40 years. There were various developments by the US government + Ma Bell, many of them statistical. My impression is that some of these were seriously important, not just pragmatically (The Web) but in mathematical theory.

And now, in particular: About my suggestion for a new section. I have only made a couple of tiny Wikipedia edits before, so I have questions. Does the addition proposed below seem reasonable? Should I just expand it a bit, try to provide references, and shove it in as best I can? Or is there a way to get help with it?

The intro to the History of Statistics includes statistics as vital statistics, etc. So, I suggest there should be a section on "Development of data gathering and reporting." This might be organized along the lines of applied technologies over the last 125 years. In the most recent years, the popular "gathering and reporting" may be based on millions of sales or billions of time-series readings, and summaries are available for general information or for more carefully drawing inferences. Or, on the other hand, "apps" allow individuals to collect personalized information that was once not imagined. So here are some crude elements I propose for a section -

Development of data gathering and reporting.

Hollerith cards facilitated the 1890 census (and thereafter, insurance and other business, using plug-programmed machines. See history of IBM?). Binet developed IQ tests and the military started looking at qualifications of soldiers for WWI. WWII war effort (US, GB - Lord Keynes) provided the first aggregation of industry reports to estimate GNP/GDP for insight into the vital sectors of the economy on both sides.

1950s: Optical scan scoring and school achievement tests used Likert, etc. test scoring theory from the 1930s. ("Baby-boomers" becoming the most "tested" generation.) Political polling became even more popular after the debacle of 1948 (mis-predicting "Truman loses").

1960s: computers: mag-tape allowed analyses of large files; IBM built large disk capacities for businesses.

1970s: affordable computers with enough capacity that regressions expand analyses from (say) 3 variables and dozens of cases to dozens of variables and thousands of cases.

1980s: computer speeds became fast enough to enable (a) "robust" statistics based on ranks and medians, or on boot-strapping; and (b) iterative procedures based on Maximum Likelihood such as logistic regression and mixed models with missing data.

1990s: PC chips became cheap enough that work-stations develop into POS (point of sale) data collection and other time series. Large disk space for data helped to promote "data mining" as a new product capability, replacing its previous use as a term used for sneering.

2000s: Google provided fast access to data in the home or office. Cellphone apps shortly after provided similar searches everywhere, plus access to other (data-providing) apps. The Netflix prize, awarded in 2009, promoted both the idea and the practice of tailoring predictions of preferences for individuals.

RichardFloyd (talk) 05:46, 3 June 2014 (UTC)RichardFloyd

- I have added some sentences to the Intro in order to set the stage for extra information about how strong the effect of computers has been.  

I think I will cite John Tukey as a major influence for statistics in the last half century. RichardFloyd (talk) 00:15, 16 June 2014 (UTC)Rich U

Revision of Origins

Statistics began as wholly engaged with society, whether considering the contributions of Florence Nightingale or the creation of life insurance by the mutual assurance societies. To dismiss these efforts, often graphically based, with paragraphs like

The first statistical bodies were established in the early 19th century. The Royal Statistical Society was founded in 1834 and Florence Nightingale, its first female member, pioneered the application of statistical analysis to health problems for the furtherance of epidemiological understanding and public health practice. However, the methods then used would not be considered as modern statistics today.

is not being true to this history. In fact, the graphical elements of Statistics have seen a resurgence, since it is crucial to convey the import of results to non-statisticians. And Mathematical Statistics has itself been shoved aside to make room for Statistics based upon numerical experiments, principally through the use of Monte Carlo methods.


I don't know where the idea that the American Statistical Association was so limited in its choice of major statisticians came from, but judging from its site, the set of such statisticians is broader than what has been suggested above. For example, Nightingale and Cornfield are both included, with Cornfield being acknowledged as a strong Bayesian. It's odd that G. E. P. Box is not included.

I disagree with one reviewer who argues that there is a "bias towards Bayesian methods", whether in history or as presentation of current practice. Bayesian methods have come to dominate the philosophical, logical, and practical aspects of the trade. While I do not at present have replacement language to suggest, surely deemphasizing what is there, or ending on anything other than Bayesian techniques as a culmination, would both be wrong.

Information Theoretic Methods

There is no discussion of the importance of information theoretic methods, grounded ultimately upon the divergence, as an important means of model comparison and other things.

This user is a member of WikiProject Statistics.

04:26, 16 July 2016 (UTC)