Data generating process

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by AnomieBOT (talk | contribs) at 12:38, 27 March 2018 (Dating maintenance tags: {{Citation needed}}). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The term data generating process is used in statistical and scientific literature in several distinct senses[citation needed]:

  • the data collection process, that is, the routes and procedures by which data reach a database (particularly where these may change over time);
  • a specific statistical model used to represent the supposed random variation in observations, often in terms of explanatory and/or latent variables;
  • a notional, non-specific probabilistic model (not directly described or explicitly set down) that would include all of the random influences that combine to produce individual observations. One instance is the supposed justification of the "common occurrence" of the normal distribution as the result of many random additive effects combining.
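The additive-effects argument in the last sense can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a definitive model: each observation is taken to be the sum of 50 independent effects drawn uniformly from [-1, 1], and the function names, the number of effects, the uniform distribution and the seed are all hypothetical choices made for this example. It then compares the share of observations within one and two standard deviations of the mean with the values expected for a normal distribution (roughly 68% and 95%).

```python
import random
import statistics


def simulate_observation(rng, n_effects=50):
    """Return one observation: the sum of many small independent effects."""
    # Each effect is uniform on [-1, 1], which is deliberately not normal.
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_effects))


def simulate_sample(n_obs=20000, n_effects=50, seed=0):
    """Draw n_obs observations from the assumed additive process."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [simulate_observation(rng, n_effects) for _ in range(n_obs)]


def share_within(sample, k):
    """Fraction of the sample lying within k standard deviations of its mean."""
    mean = statistics.fmean(sample)
    sd = statistics.pstdev(sample)
    return sum(abs(x - mean) <= k * sd for x in sample) / len(sample)


sample = simulate_sample()
sample_sd = statistics.pstdev(sample)
within_one_sd = share_within(sample, 1)
within_two_sd = share_within(sample, 2)

print(f"standard deviation: {sample_sd:.3f}")
print(f"share within 1 sd:  {within_one_sd:.3f}")
print(f"share within 2 sd:  {within_two_sd:.3f}")
```

Although no single effect is normally distributed, the shares printed for the sum should lie close to those of a normal distribution. Changing the assumed distribution of the individual effects, provided it has finite variance, would be expected to give a similar picture.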