Welcome to the statistics portal
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. It is applicable to a wide variety of academic disciplines, from the natural and social sciences to the humanities, government and business.
Statistical methods are used to summarize and describe a collection of data; this is called descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called inferential statistics.
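The distinction can be made concrete with a minimal sketch using only Python's standard library. The sample values and the helper names `describe` and `mean_confidence_interval` are hypothetical. The interval uses a normal approximation, which is an assumption made for brevity; for a sample this small a t-based interval would usually be preferred.

```python
import math
import statistics
from statistics import NormalDist


def describe(sample):
    """Descriptive statistics: summarize the observed sample itself."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation (n - 1 divisor)
    }


def mean_confidence_interval(sample, confidence=0.95):
    """Inferential statistics: approximate interval for the population mean.

    Assumes the sample mean is roughly normally distributed.
    """
    summary = describe(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    half_width = z * summary["stdev"] / math.sqrt(summary["n"])
    return summary["mean"] - half_width, summary["mean"] + half_width


# Hypothetical measurements, invented for illustration only.
heights_cm = [162.0, 175.5, 168.2, 181.0, 170.3, 166.8, 177.1, 172.4]

summary = describe(heights_cm)
low, high = mean_confidence_interval(heights_cm)
print(summary)
print(f"Approximate 95% interval for the population mean: ({low:.1f}, {high:.1f})")
```

The first function only describes the data in hand; the second makes a statement, with stated uncertainty, about the population the data were drawn from.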
Statistics arose no later than the 18th century from the need of states to collect data on their people and economies, in order to administer them. The meaning broadened in the early 19th century to include the collection and analysis of data in general.
A chimpanzee and a typewriter
The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare.
In this context, "almost surely" is a mathematical term with a precise meaning, and the "monkey" is not an actual monkey; rather, it is a metaphor for an abstract device that produces a random sequence of letters ad infinitum. The theorem illustrates the perils of reasoning about infinity by imagining a vast but finite number, and vice versa. The probability of a monkey exactly typing a given text as long as, for example, Shakespeare's Hamlet is so tiny that, were the experiment conducted, the chance of it occurring within a span of time of the order of the age of the universe would be minuscule, but not zero.
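The contrast between "vast but finite" and "infinite" can be sketched numerically. The example below assumes a hypothetical 26-key keyboard with uniformly random, independent keystrokes, and treats each attempt as a separate, non-overlapping block of keystrokes. The text length of 130,000 characters is only a rough, assumed stand-in for a play-length text.

```python
import math


def log10_prob_exact(text_length, keys):
    """log10 of the chance that text_length random keystrokes match a fixed text."""
    return -text_length * math.log10(keys)


def prob_at_least_once(p, attempts):
    """Chance of at least one success in `attempts` independent tries.

    Equals 1 - (1 - p)**attempts, computed in a way that stays accurate for tiny p.
    """
    return -math.expm1(attempts * math.log1p(-p))


# A short target: the six-letter word "banana" on 26 keys.
p_banana = 26 ** -6
for attempts in (10**6, 10**9, 10**12):
    print(attempts, prob_at_least_once(p_banana, attempts))

# A long target: the probability is too small for a float, so work in log10.
print(log10_prob_exact(130_000, 26))
```

For the short word the probability of success climbs towards 1 as the number of attempts grows, which is the finite shadow of "almost surely". For the long text the exponent is so large and negative that no physically realistic number of attempts changes the picture.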
In 2003, an experiment was performed with six Celebes crested macaques, but their literary contribution amounted to five pages consisting largely of the letter 'S'.
William Sealy Gosset (1876–1937) is better known by his pen name Student, which he gave to Student's t-test and Student's t-distribution. He joined the Dublin brewery of Arthur Guinness & Son in 1899, where he applied his statistical knowledge both in the brewery and on the farm, to the selection of the best-yielding varieties of barley. Gosset's key 1908 papers addressed the brewer's concern with small samples. To prevent further disclosure of confidential information, Guinness prohibited its employees from publishing any papers, regardless of their content, so Gosset published under the pseudonym Student to avoid detection by his employer.
Featured and good articles
These are featured or good articles on statistics topics.
- Featured articles
- Featured lists
- Good articles
Related projects and portals
Simpson's paradox for continuous data: a positive trend appears for each of two separate groups (blue and red), while a negative trend (black, dashed) appears when the data are combined.
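The effect described in the caption can be reproduced with a small sketch. The two groups below are invented data, named after the colours in the caption, and `ols_slope` is a hypothetical helper that computes the ordinary least-squares slope of y on x.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Invented data: within each group, y rises with x.
blue_x, blue_y = [1, 2, 3, 4], [11, 12, 13, 14]
red_x, red_y = [8, 9, 10, 11], [3, 4, 5, 6]

slope_blue = ols_slope(blue_x, blue_y)
slope_red = ols_slope(red_x, red_y)
# Pooling the groups ignores that red sits at higher x but lower y overall.
slope_all = ols_slope(blue_x + red_x, blue_y + red_y)

print("blue:", slope_blue)
print("red:", slope_red)
print("combined:", slope_all)
```

Each group has a positive slope, but the combined slope is negative because the difference between the groups dominates the trend within them.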
Did you know?
- ... that Alec Gallup, co-chairman of The Gallup Organization and the son of founder George Gallup, was described as someone who could "smell out a bad question or an unreasonable interpretation of data"?
- ... that the convergence of the iterative proportional fitting method for estimating the cell values of a contingency table was re-proved using differential geometry?
- ... that statistical properties dictated by Benford's Law are used in auditing of financial accounts as one means of detecting fraud?
- ... that Henry Mann's 1949 book, Analysis and design of experiments, filled mathematical gaps in the statistical writings of Ronald A. Fisher?
- ... that Gustav Elfving invented the optimal design of experiments, and so minimized the cost of a cartographic survey, while trapped in his tent in storm-ridden Greenland?
- ... that in 2009, Revolution Analytics named Norman H. Nie, one of the original SPSS developers, as their new CEO?
- ... that proper design of a sampling frame can be crucial in statistical research?
- ... that least-squares spectral analysis is a method for estimating a frequency spectrum, based on a least squares fit between data and trigonometric functions?
- ... that the Holtsmark distribution was proposed in 1919 as a model for the gravitational field of stars?
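The Benford's Law item in the list above lends itself to a short illustration. Under Benford's Law the leading digit d occurs with probability log10(1 + 1/d). The sketch below compares those expected shares with the observed leading digits of a data set. The powers of 2 stand in for real financial figures purely as an assumption for demonstration, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_shares(values):
    """Observed share of each leading digit among non-zero integers."""
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Stand-in data: powers of 2, a sequence known to follow Benford's Law closely.
amounts = [2 ** n for n in range(1000)]

expected = benford_expected()
observed = leading_digit_shares(amounts)
for d in range(1, 10):
    print(d, f"expected {expected[d]:.3f}", f"observed {observed[d]:.3f}")
```

In an audit setting, a set of figures whose leading digits depart sharply from the expected shares might be flagged for closer inspection; such a departure is a prompt for further checks, not proof of fraud.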
Topics in Statistics