Gy's sampling theory

Gy's sampling theory is a theory about the sampling of materials, developed by Pierre Gy from the 1950s to the early 2000s[1] in articles and books including:

  • (1960) Sampling nomogram
  • (1979) Sampling of particulate materials; theory and practice
  • (1982) Sampling of particulate materials; theory and practice; 2nd edition
  • (1992) Sampling of Heterogeneous and Dynamic Material Systems: Theories of Heterogeneity, Sampling and Homogenizing
  • (1998) Sampling for Analytical Purposes

The abbreviation "TOS" is also used to denote Gy's sampling theory.[2]

Gy's sampling theory uses a model in which the taking of a sample is represented by independent Bernoulli trials, one for every particle in the parent population from which the sample is drawn. Each trial has two possible outcomes: (1) the particle is selected and (2) the particle is not selected. The selection probability may differ from one particle to another. The model used by Gy is mathematically equivalent to Poisson sampling.[3] Using this model, Gy derived the following equation for the variance of the sampling error in the mass concentration of a sample:

V = \frac{1}{(\sum_{i=1}^N q_i m_i)^2} \sum_{i=1}^N q_i(1-q_i) m_{i}^{2} \left(a_i - \frac{\sum_{j=1}^N q_j a_j m_j}{\sum_{j=1}^N q_j m_j}\right)^2 .

in which V is the variance of the sampling error, N is the number of particles in the population (before the sample was taken), q_i is the probability of including the i-th particle of the population in the sample (i.e. the first-order inclusion probability of the i-th particle), m_i is the mass of the i-th particle of the population and a_i is the mass concentration of the property of interest in the i-th particle of the population.
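
As an illustration, the equation for V can be evaluated directly from the three lists q_i, m_i and a_i. The following Python function is a minimal sketch; the name gy_variance and the two-particle toy population are hypothetical and serve only to show how the terms combine.

```python
from typing import Sequence


def gy_variance(q: Sequence[float], m: Sequence[float], a: Sequence[float]) -> float:
    """Evaluate the (linearized) variance V of the sampling error.

    q[i]: inclusion probability, m[i]: mass, a[i]: mass concentration
    of the i-th particle. The function name is illustrative only.
    """
    if not (len(q) == len(m) == len(a)):
        raise ValueError("q, m and a must have the same length")

    # Expected sample mass: sum_i q_i m_i
    expected_mass = sum(qi * mi for qi, mi in zip(q, m))
    if expected_mass <= 0:
        raise ValueError("the expected sample mass must be positive")

    # Reference concentration: sum_j q_j a_j m_j / sum_j q_j m_j
    a_ref = sum(qi * ai * mi for qi, mi, ai in zip(q, m, a)) / expected_mass

    # sum_i q_i (1 - q_i) m_i^2 (a_i - a_ref)^2
    spread = sum(
        qi * (1 - qi) * mi**2 * (ai - a_ref) ** 2 for qi, mi, ai in zip(q, m, a)
    )
    return spread / expected_mass**2


# Toy population (made up): two particles of equal mass, one barren, one pure.
print(gy_variance(q=[0.5, 0.5], m=[1.0, 1.0], a=[0.0, 1.0]))  # 0.125
```

Two limiting cases follow from the formula: V is zero when every particle is certain to be selected (all q_i equal to 1) and when all particles have the same concentration.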

The equation for V given above is an approximation: it is based on a linearization of the mass concentration in the sample.
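
One way to see where such a linearization leads is sketched below. The notation is an assumption made here for illustration and does not come from the cited works: I_i is the 0/1 selection indicator of the i-th particle, a_sample is the concentration in the sample, and \bar{a} is the weighted mean concentration that appears inside the brackets of the equation for V.

```latex
% Assumed notation: I_i = 1 if particle i is selected, 0 otherwise;
% the I_i are independent with E[I_i] = q_i and Var(I_i) = q_i (1 - q_i).
\begin{align*}
a_\text{sample} &= \frac{\sum_{i=1}^N I_i a_i m_i}{\sum_{i=1}^N I_i m_i},
\qquad
\bar{a} = \frac{\sum_{j=1}^N q_j a_j m_j}{\sum_{j=1}^N q_j m_j}, \\
a_\text{sample} - \bar{a}
  &= \frac{\sum_{i=1}^N I_i m_i (a_i - \bar{a})}{\sum_{i=1}^N I_i m_i}
  \approx \frac{\sum_{i=1}^N I_i m_i (a_i - \bar{a})}{\sum_{i=1}^N q_i m_i}, \\
V &\approx \frac{1}{\left(\sum_{i=1}^N q_i m_i\right)^2}
  \sum_{i=1}^N q_i (1 - q_i) m_i^2 (a_i - \bar{a})^2 .
\end{align*}
```

The first equality in the second line is exact. The approximation replaces the random sample mass in the denominator by its expected value, after which the numerator is a sum of independent terms with zero mean, so its variance is the sum of the individual variances.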

In the theory of Gy, correct sampling is defined as a sampling scenario in which all particles have the same probability of being included in the sample. This implies that q_i no longer depends on i, and can therefore be replaced by the symbol q. The weighted mean concentration then reduces to the concentration of the whole batch, and the expected sample mass to q times the batch mass, so Gy's equation for the variance of the sampling error becomes:

V = \frac{1-q}{q M_\text{batch}^2} \sum_{i=1}^N m_{i}^{2} \left(a_i - a_\text{batch} \right)^2 .

where a_batch is the concentration of the property of interest in the population from which the sample is to be drawn and M_batch is the mass of that population. A similar equation had already been derived in 1935 by Kassel and Guy.[4][5]
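
The correct-sampling formula can be checked numerically against a simulation of the underlying Bernoulli model. The Python sketch below is illustrative only: the function names, the synthetic population of 200 particles and the choice q = 0.5 are all assumptions, and empty samples (for which the concentration is undefined) are simply redrawn.

```python
import random
from statistics import pvariance


def gy_variance_correct(q, m, a):
    """Variance of the sampling error when every particle has the same q."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    m_batch = sum(m)                                          # M_batch
    a_batch = sum(ai * mi for ai, mi in zip(a, m)) / m_batch  # a_batch
    spread = sum(mi**2 * (ai - a_batch) ** 2 for ai, mi in zip(a, m))
    return (1 - q) / (q * m_batch**2) * spread


def simulate_sample_concentrations(q, m, a, trials, rng):
    """Run one independent Bernoulli trial per particle, `trials` times over."""
    concentrations = []
    while len(concentrations) < trials:
        picked = [i for i in range(len(m)) if rng.random() < q]
        sample_mass = sum(m[i] for i in picked)
        if sample_mass == 0:
            continue  # assumption: redraw empty samples
        concentrations.append(sum(a[i] * m[i] for i in picked) / sample_mass)
    return concentrations


# Synthetic population (made up for this sketch).
rng = random.Random(0)
m = [rng.uniform(0.5, 1.5) for _ in range(200)]  # particle masses
a = [rng.uniform(0.0, 1.0) for _ in range(200)]  # particle concentrations
q = 0.5

predicted = gy_variance_correct(q, m, a)
observed = pvariance(simulate_sample_concentrations(q, m, a, 4000, rng))
print(f"predicted V = {predicted:.3e}, simulated V = {observed:.3e}")
```

Because the formula rests on a linearization, the two numbers are expected to agree only approximately; the agreement should improve as the expected number of particles in the sample grows.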

References

  1. ^ Gy, P. (2004). Chemometrics and Intelligent Laboratory Systems 74: 61–70.
  2. ^ K.H. Esbensen. 50 years of Pierre Gy's “Theory of Sampling”—WCSB1: a tribute. Chemometrics and Intelligent Laboratory Systems. Volume 74, Issue 1, 28 November 2004, pages 3–6.
  3. ^ Geelhoed, B.; Glass, H. J. (2004). "Comparison of theories for the variance caused by the sampling of random mixtures of non-identical particles". Geostandards and Geoanalytical Research 28 (2): 263–276. doi:10.1111/j.1751-908X.2004.tb00742.x. 
  4. ^ Kassel, L. S.; Guy, T. W. (1935). "Determining the correct weight of sample in coal sampling". Industrial and Engineering Chemistry Analytical Edition 7 (2): 112–115. 
  5. ^ Cheng, H.; Geelhoed, B.; Bode, P. (2011). "A Markov Chain Monte Carlo comparison of variance estimators for the sampling of particulate mixtures". Applied Stochastic Models in Business and Industry. doi:10.1002/asmb.878.