Statistical population has been listed as a level-4 vital article in Mathematics. If you can improve it, please do. This article has been rated as Start-Class.
WikiProject Statistics (Rated Start-class, High-importance)
- Yes. The article is not now very well written; maybe I'll work on it at some point. Michael Hardy 18:43, 24 Dec 2003 (UTC)
Yes, I have (also?) had trouble understanding this article. At the time of writing, it defines a "Statistical Population" in terms of a mathematical function applied to a "Population". If the term is defined in terms of itself (that is, "Population" is defined in terms of "Population"), then surely this should be explained. Self-referential definitions are, I think, generally recognized as a bad principle for encyclopedia articles.
Surely what we are after here is a simple logical explanation of what the term "Statistical Population" means to ordinary people with some special interest, with minimal reference to existing specialized conventions.
I think an encyclopedia deserves to have the real meaning explained, not a (possibly shallow) reiteration of existing specialized and formalized definitions put into words.
I think the aim here is conceptual illustration, not rewordings of conventional formal definitions presented without the original motivation, understanding and explanation.
I also think Wikipedia could outline some policy on avoiding reworded versions of specialized conventional formal definitions, and instead recommend tutorial explanation of abstract concepts in terms of more familiar, down-to-earth concepts. Otherwise people need to do the three-year university course to understand the article (and possibly the doctorate!).
Suggestion for an example-based introduction
(I did not want to add this, since I am not even a math student, and I could be wrong on some points. But I believe that this approach will help many, many more readers understand the concept. I hope someone with more statistics knowledge could read it and consider adding it, instead of the current crows example. Note that my example also contains crows ;) )
West Hill High School has 422 students. The principal wants to know the average grade of the students. Fortunately, all the grades are recorded in a register, and he can easily calculate the average. In this case, the set of all students is the population.
He also wants to know the average weight of the students, but this is not written down anywhere. He could ask them all to show up for weighing, but that would be a time-consuming process. Instead he takes a sample, i.e. a subset, of the students (the population). With a sample of 50 students, he can make a reasonable estimate of the average weight of all the students (the population).
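To make the sampling step concrete, here is a minimal Python sketch of the principal's procedure. The weights are made-up random numbers (an assumed mean of 60 kg and spread of 10 kg, neither of which comes from the example), and the function name `estimate_mean_from_sample` is only illustrative.

```python
import random
import statistics

def estimate_mean_from_sample(population, sample_size, seed=0):
    """Draw a simple random sample without replacement.

    Returns (sample_mean, population_mean) so the two can be compared.
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # no student is picked twice
    return statistics.mean(sample), statistics.mean(population)

# Hypothetical data: 422 student weights in kg, invented for illustration only.
data_rng = random.Random(42)
weights = [data_rng.gauss(60, 10) for _ in range(422)]

sample_mean, true_mean = estimate_mean_from_sample(weights, 50)
print(f"sample mean: {sample_mean:.1f} kg, population mean: {true_mean:.1f} kg")
```

In the real situation the principal would only ever see the sample mean; the population mean is printed here only because the simulated data make it available for comparison.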
In the high school example, he knows exactly how many students there are, and if it were worth the trouble he could actually examine each entity in the population. But in other cases the population can be of unknown size, e.g. the number of crows in a county. And even if the size of the population is known, it can be impossible to collect data for every entity.
Knowing the size of a population does not allow us to make better estimates, but it allows us to know better how much to trust our estimates. Statistics is not simply about giving estimates, but about calculating the probability of different prognoses being true. Say we have estimated the average weight of the crows by weighing a sample of 25, and the average weight was 1000 g. We might then estimate the average weight of the population to be 1000 g, and claim that the average weight of the whole population must be between 900 g and 1100 g. If there is a total of 30 crows in the county, that gives our claim a high probability of being true. However, if there are 50,000 crows in the county, the probability of our claim being true is smaller.
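One standard way to put numbers on this is the finite population correction for sampling without replacement, where the standard error of the sample mean is (s/√n)·√((N−n)/(N−1)). The sketch below applies it to the crow example. The sample standard deviation of 250 g is purely an assumption, since the example does not give one, and `standard_error` is an illustrative name.

```python
import math

def standard_error(sample_sd, n, population_size):
    """Standard error of the mean for a simple random sample drawn
    without replacement, including the finite population correction."""
    if not 1 <= n <= population_size:
        raise ValueError("need 1 <= n <= population_size")
    if n == population_size:
        return 0.0  # the whole population was measured, so no sampling error
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return sample_sd / math.sqrt(n) * fpc

ASSUMED_SD = 250.0  # grams; hypothetical value, not taken from the example

for crows_in_county in (30, 50_000):
    se = standard_error(ASSUMED_SD, 25, crows_in_county)
    print(f"N={crows_in_county}: standard error about {se:.1f} g, "
          f"so 100 g is {100 / se:.1f} standard errors")
```

Under that assumed spread, the standard error is roughly 21 g with 30 crows and roughly 50 g with 50,000 crows, so the same 900 g to 1100 g claim sits much further out in standard-error terms for the small population.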
There are also populations that are infinite. Say we want to measure the mean temperature over a year. The population is the points in time in that year. We could take measurements every day, every hour, every minute, etc. But even if we make a measurement a million times per second, we still have not covered all points in time.
The terms sample and population make most sense in relation to each other.
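A small simulation may help with the infinite case. It samples random points in time from an invented temperature curve whose true yearly mean is 10 °C by construction. Both the curve `temperature_at` and the helper `estimate_yearly_mean` are assumptions made for illustration, not real measurements.

```python
import math
import random

DAYS_PER_YEAR = 365.0

def temperature_at(t_days):
    """Hypothetical temperature in degrees C at time t (in days).

    One full sine cycle per year, so the exact yearly mean is 10.
    """
    return 10.0 + 10.0 * math.sin(2 * math.pi * t_days / DAYS_PER_YEAR)

def estimate_yearly_mean(n_samples, seed=0):
    """Average the temperature at n_samples uniformly random times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        t = rng.uniform(0.0, DAYS_PER_YEAR)  # one point in time from the population
        total += temperature_at(t)
    return total / n_samples

for n in (10, 1_000, 100_000):
    print(n, round(estimate_yearly_mean(n), 2))
```

No finite number of samples covers every point in time, but the estimate can still get close to the true mean as the sample grows.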
German population is ambiguous: if by that is meant the people with German citizenship, then all do not have the same "genetic heritage" for one thing. — Preceding unsigned comment added by 2A01:E35:8AD5:C150:1935:D725:D0F5:9C34 (talk) 22:38, 6 August 2014 (UTC)