Statistical methods are used to summarize and describe a collection of data; this is called descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called inferential statistics.
Statistics arose no later than the 18th century from the need of states to collect data on their people and economies, in order to administer them. The meaning broadened in the early 19th century to include the collection and analysis of data in general.
Maximum likelihood estimation (MLE) is a popular statistical method for fitting a mathematical model to data. When modeling real-world data, estimation by maximum likelihood offers a way of tuning the free parameters of the model to provide a good fit. The method was pioneered by Sir Ronald A. Fisher between 1912 and 1922. For a fixed set of data and an underlying probability model, maximum likelihood picks the values of the model parameters that make the observed data more likely than any other parameter values would. For the normal distribution and many other problems, maximum likelihood estimation yields a unique solution that is easy to determine.
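As a minimal sketch of the normal-distribution case, the maximum likelihood estimates have a closed form: the sample mean, and the mean squared deviation with divisor n (not n - 1). The function names and the sample values below are illustrative assumptions, not taken from any particular source.

```python
import math


def normal_log_likelihood(data, mu, sigma2):
    """Log-likelihood of i.i.d. normal data for mean mu and variance sigma2."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)


def normal_mle(data):
    """Closed-form maximum likelihood estimates (mean, variance) for normal data."""
    n = len(data)
    mu_hat = sum(data) / n
    # The MLE of the variance divides by n, so it is slightly biased downward.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, sigma2_hat


# Made-up sample, for illustration only.
sample = [4.9, 5.1, 5.6, 4.7, 5.3, 5.0]
mu_hat, sigma2_hat = normal_mle(sample)
best = normal_log_likelihood(sample, mu_hat, sigma2_hat)

# Any other parameter values should give a lower (or equal) log-likelihood.
print(mu_hat, sigma2_hat, best)
print(normal_log_likelihood(sample, mu_hat + 0.2, sigma2_hat) <= best)
```

For models without a closed-form solution, the same log-likelihood function would typically be handed to a numerical optimizer instead.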
William Sealy Gosset (1876–1937) is better known by his pen name Student, which he gave to Student's t-test and Student's t-distribution. He joined the Dublin brewery of Arthur Guinness & Son in 1899, where he applied his statistical knowledge both in the brewery and on the farm to the selection of the best yielding varieties of barley. Gosset's key 1908 papers addressed the brewer's concern with small samples. To prevent further disclosure of confidential information, Guinness prohibited its employees from publishing any papers, regardless of their content, so Gosset published under the pseudonym Student to avoid detection by his employer.
Anscombe's quartet comprises four datasets that have nearly identical simple statistical properties (mean, standard deviation, correlation, etc.), yet are revealed to be very different when inspected graphically. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician F.J. Anscombe to demonstrate the importance of graphing data before analyzing them, and the effect of outliers on the statistical properties of a dataset.
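The near-identical summary statistics can be checked directly. The sketch below uses the quartet's values as they are commonly reproduced; they are not given in the text above, so treat them as an assumption and verify them against Anscombe's paper before relying on them. The helper name `pearson` is illustrative.

```python
import math
import statistics

# Values as commonly reproduced for Anscombe's quartet (assumed, not from the text).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# (mean x, mean y, sample stdev x, sample stdev y, correlation) per dataset.
summary = {
    name: (statistics.mean(xs), statistics.mean(ys),
           statistics.stdev(xs), statistics.stdev(ys), pearson(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, row in summary.items():
    print(name, [round(v, 2) for v in row])
```

The printed rows agree to about two decimal places, which is exactly why the summary alone is misleading: only a scatter plot of each dataset shows the curve in the second, the outlier in the third, and the single high-leverage point in the fourth.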
... that while the center of gravity for a set of points is located at the spot from which the sum of the squares of distances to all the points is minimized, the geometric median is the spot from which the sum of distances is minimized?
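The contrast between the two points can be illustrated numerically. The sketch below computes the center of gravity (centroid) directly and approximates the geometric median with Weiszfeld's iteration, a standard technique that the text does not name; the function names, the iteration count, and the four sample points are assumptions chosen for illustration. Clamping tiny distances to `eps` is a simplification to avoid division by zero, not a full treatment of the case where an iterate lands on a data point.

```python
import math


def centroid(points):
    """Center of gravity: the coordinate-wise mean of the points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sum_distances(center, points):
    return sum(math.dist(center, p) for p in points)


def sum_squared_distances(center, points):
    return sum(math.dist(center, p) ** 2 for p in points)


def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the geometric median with Weiszfeld's iteration."""
    x, y = centroid(points)  # start from the centroid
    for _ in range(iterations):
        weight_sum = x_sum = y_sum = 0.0
        for px, py in points:
            # Each point is weighted by the inverse of its current distance.
            w = 1.0 / max(math.dist((x, y), (px, py)), eps)
            weight_sum += w
            x_sum += w * px
            y_sum += w * py
        x, y = x_sum / weight_sum, y_sum / weight_sum
    return (x, y)


# Illustrative points: three clustered near the origin and one far away.
points = [(0, 0), (1, 0), (0, 1), (10, 10)]
c = centroid(points)
m = geometric_median(points)

print("centroid:", c, "geometric median:", m)
print("sum of distances:", sum_distances(c, points), sum_distances(m, points))
print("sum of squares:", sum_squared_distances(c, points), sum_squared_distances(m, points))
```

The distant point pulls the centroid well away from the cluster, while the geometric median stays close to it, which is why the geometric median is often described as the more outlier-resistant of the two.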