# Talk:Statistical model

WikiProject Statistics (Rated Start-class, High-importance)


Could this be explained for the layman?

## Introduction

The Introduction to this article should be generally readable by people who have little training in statistics. The Introduction was previously in need of improvement.

Then someone (Kri) made a change so that the Introduction began with this sentence: "A statistical model is a formalization of stochastic relationships between variables in the form of mathematical equations". The sentence is incomprehensible to most people, because they do not know what stochastic means. The justification for the change was that the term is "explained later in the paragraph". It is didactically awful to use a technical term and define the term later; a term should first be defined, at least intuitively, and then used.

I have reverted the change. I have also made an edit that I hope improves clarity and corrects an error (it is not necessary, or even usual, that the true model is in P). Further work is needed, though.
86.149.160.165 (talk) 17:02, 6 November 2014 (UTC)

Thank you for explaining why you reverted my edit this time. Reverting someone's edits without explanation is usually not a good idea, as it can easily come across as destructive to the editor who made the first edit, who obviously thought the change was constructive.
As for my justification for the edit, I didn't mean that the term stochastic was explained later in the paragraph; what I meant was that the fact that the relationships are stochastic was stated later in the paragraph (although I used the word "explained" instead of "stated"). So I thought, why not make that statement about the relationships already the first time they are mentioned? But if you thought it was incomprehensible to most people, then maybe it was. —Kri (talk) 21:59, 6 November 2014 (UTC)
I should have explained the first time, I definitely agree, and will do so in the future. And I really appreciate your elaborating.
[I'm the same editor as before.]  86.152.238.35 (talk) 13:31, 7 November 2014 (UTC)

## Proposed merge with Statistical assumption

The article Statistical assumption makes very little sense. It claims that there are "non-modelling assumptions". Yet the set of statistical assumptions is the statistical model.

The reference given in the article [McPherson, 1990 (Section 3.3)] states the following.

The vast majority of statistical models require the assumption that the sample which provides the data has been selected by a process of random selection…. The importance of this assumption is made apparent in Chapter 5.
Where the sample members are not independently selected, there is a need to assume a structure or mechanism by which observations made on the sample members are connected. Generally, the statistical description is difficult even though the experimental description may be simple. For example, where plants are competing for light, moisture or nutrients, a strong growing plant is likely to be surrounded by weaker plants because of competition. Attempting to model the effects of this competition is not an easy task and frequently leads to the introduction of parameters of unknown value into the model.

Note the repeated use of the term "model".

The valid parts of the article Statistical assumption should be merged into the article Statistical model.
FlagrantUsername (talk) 17:44, 8 November 2014 (UTC)

It has now been over two months since the merge was proposed, and there are no comments. So, I have left the article Statistical assumption unmerged, but substantially edited the article to make things clearer.   FlagrantUsername (talk) 15:18, 17 January 2015 (UTC)

## Re Grey box completion and validation

The “See also” link “Grey box completion and validation” has been removed anonymously without explanation from this and several other topics. Following advice from Wikipedia, if there are no objections (please provide your name and reasons), I plan to reinstate the reference in a week's time.

The removed reference provides information on a general method of developing models where part of the model structure is known. In particular, most models are incomplete (i.e. a grey box) and thus need completion and validation. This reference seems to be within the appropriate content of the “See also” section; see Wikipedia:Manual_of_Style/Layout#See_also_section.

BillWhiten (talk) 05:30, 22 March 2015 (UTC)

My suggestion is to rename Grey box completion and validation to "Grey box model", and revise the article appropriately. Having an article about completion and validation makes little sense, and doing so while not having an article for Grey box models generally makes no sense at all.
It is also highly questionable whether you should be doing something like this that promotes your own work.
SolidPhase (talk) 21:56, 30 March 2015 (UTC)

## Example

The example is somewhat useful but loses its tractability when things get formal. The sample space $S$ is fine, defined like that, as is $\Theta$. What is $P_{\theta}$, though? Without that information (i.e. a formula), a novice reader is lost. Neither can the mapping $P_{\theta}$ be derived from information contained in the example (indeed, one would need to know the distribution of ages at the very least), nor can $\mathcal{P}$ be determined in the absence of the mapping. Can anyone work out the example in more detail? Athenray (talk) 08:49, 4 September 2015 (UTC)
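
One possible way to make the mapping explicit is sketched here. It rests on assumptions that are not stated in this thread: that the example is a simple linear regression of a response $y_i$ (say, height) on age $x_i$ with independent Gaussian errors, and that the ages $x_1, \dots, x_n$ are treated as fixed (conditioned on), so that no distribution of ages is needed. The symbols $b_0$, $b_1$, $\sigma^2$ and $n$ are introduced only for illustration.

$$
\begin{aligned}
S &= \mathbb{R}^n, \qquad \theta = (b_0, b_1, \sigma^2), \qquad \Theta = \mathbb{R} \times \mathbb{R} \times (0, \infty), \\
P_{\theta} &= \text{the distribution on } S \text{ with density } \;
p_{\theta}(y_1, \dots, y_n) = (2\pi\sigma^2)^{-n/2} \exp\!\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2 \right), \\
\mathcal{P} &= \{ P_{\theta} : \theta \in \Theta \}.
\end{aligned}
$$

Under these assumptions each $\theta$ picks out one multivariate normal distribution on $S$, and $\mathcal{P}$ is the set of all such distributions. If the ages were instead treated as random, the distribution of ages would indeed have to be specified as well, as noted above.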