Talk:Kriging

This is an old revision of this page, as edited by Antro5 (talk | contribs) at 18:27, 9 February 2007 (Proposal for revision). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Repeated attempts to break the Neutral point of view rule

All articles and policies must follow Neutral point of view, Verifiability, and No original research.

This article should:

  • describe what Kriging is
  • tell where it comes from
  • say how it is used, and by whom
  • say how it is connected to other interpolation and approximation methods

This article should not:

  • express the point of view of one particular person
  • say that Kriging is good or bad
  • be understandable only to specialists

—The preceding unsigned comment was added by 160.228.120.4 (talk) 10:23, 8 February 2007 (UTC).[reply]


Ongoing discussion with Merksmatrix about the NPOV

Dear Merksmatrix

First, I think you do not understand very well what linear prediction is and what Kriging means. In my opinion, you tend to confuse the data and the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be linear? Anyway, if people want to use Kriging, why do you want to prevent them?


Why do you persist in using Wikipedia to spread your own point of view, against NPOV?


What I do understand is that assuming continued mineralization between boreholes does not make sense. You can do whatever you like but you ought to study Matheron's seminal work before you assume continuity between measured values in ordered sets, interpolate by kriging, select the least biased and most precise subset of some infinite set of kriged estimates, smooth its pseudo kriging variance to perfection and rig the rules of classical statistics in the process. Please do sign your message!!!--Merksmatrix 19:40, 8 February 2007 (UTC)[reply]



  • First question: do you acknowledge that you are breaking NPOV?

In my opinion, you are breaking NPOV, for the very reason that you are claiming that Kriging is not statistically well-founded (which, in my opinion, is not an interesting point of view).

Whether or not you acknowledge it, I propose that the article be reverted to a neutral form until a solution is settled. Any revert without justification may be considered vandalism. If you want to modify the article, do not break NPOV. In particular, stop using serpentine tactics such as cluttering the article with lingo that only specialists understand.

  • Second question: do you really think that Matheron's seminal work is important for explaining what Kriging is?

I would like to point out that I have read some of his work. I personally find a lot of his notes quite useless and, besides, very difficult to read (this is my point of view). What is important for someone who wants to know about Kriging is to understand what Kriging is and why it is used.

  • Third question: do you really consider yourself a scientist?

In science, if someone finds something not suited to his purpose, nobody will prevent this person from using something else. If you have something better to propose, publish it! Be a scientist, not a religionist.

Antro5 18:16, 9 February 2007 (UTC)[reply]

Proposal for revision (OLD ?)

This article gives a brief overview of what Kriging is and describes it using many links to other (complex) entities. I would like to make this article more self-contained and give some insight into the ideas behind Kriging and what its pros and cons are.

I propose the following sections:

  1. Idea(s) behind Kriging
  2. Does each kriged estimate have its own variance?
  3. Simple Kriging
  4. Best Linear Unbiased Estimator
  5. Pros and Cons
  6. Extensions of Simple Kriging
  7. Software

-- Scheidtm 19:59, 15 March 2006 (UTC)[reply]

Comments

Sounds good, but isn't the Best Linear Unbiased Estimator a consequence of the Gauss-Markov theorem ? Do you need a whole section to explain it? -- hike395 02:22, 16 March 2006 (UTC)[reply]
Hmmm, I am not that familiar with Gaussian processes. But "Locality" would be a good substitute anyway. -- Scheidtm 21:16, 16 March 2006 (UTC)[reply]

Can any Wikipedian tell me whether or not each distance-weighted average had its own variance before it was reborn as variance-deprived but honorific kriged estimate? That’s the crux of the matter! The rest are details! Please be concise and succinct for a change because I've been fed circular logic and opaque dogma by the geostatistical fraternity since the early 1990s.

I know spatial dependence may be assumed because Journel said so in 1992. The original reference behind Journel’s cryptic remark (“a decision rather”) ought to be posted under References where the first three seminal textbooks on geostatistical fiction should be similarly honored. Another work of sublime interest is Armstrong and Champigny's A Study of Kriging Small Blocks, in which the authors caution against oversmoothing. Apparently, the requirement of functional independence can be violated a little but not a lot. What I enjoy more than most people is fuzzy logic. Invoking WP’s vanity policy when authors refer to their own reviewed and published works reflects a subtle sense of humor.--Iconoclast 17:45, 13 April 2006 (UTC)[reply]

We're not invoking the vanity policy, but WP:NOR. You have read it, yes?
With a bit of reflection, you will see that it is impossible to write a collaborative encyclopedia, one which anyone can edit, without specifically disallowing original research from each contributor. By forcing all editors to provide verifiable sources, attributable to others, not themselves, and to cite them, we have in place a mechanism which avoids endless, frustrating, back-and-forth edit wars.
Can you provide a source for your assertions, which is not written by yourself? That is the crux of the matter. Antandrus (talk) 00:45, 14 April 2006 (UTC)[reply]

A question about the variance of "samples with different weights" was posed on AI-Geostats Open Website on October 7, 2005, and the formula was posted on October 10, 2005. The webmaster didn't post the entire exchange in which several subscribers took part. Plain logic dictates that this variance formula applies not only to area, count, density, length, mass and volume-weighted averages but also to distance-weighted averages aka kriged estimates. I would have been aware if some geostatistical scholar had issued an exclusion edict for kriged estimates. However, tenets tend to change fast when common sense threatens geostatistics. Journel postulated that spatial dependence may be assumed "unless proven otherwise" but was troubled that somebody would apply "Fischerian" [sic!] statistics to prove otherwise. Please let me know if more references are required. --Iconoclast 16:38, 14 April 2006 (UTC)[reply]

comments of the author of the figure

Dear all,

I think that the last version of this article has introduced confusion and inexactness. For instance, in the first paragraph, it is claimed that Krige developed Kriging. This is false. Matheron did, in the 60s, using Krige's ideas published in his MSc report.

About the controversy, I would say that it is irrelevant. I do not think that this article should be the place to discuss the validity of modeling by random processes.

The references are irrelevant too. Good references are Matheron's published work, Cressie, Chiles and Delfiner, Wackernagel and Stein.

Finally, I would say that it is an error to think that Kriging can only be used for spatial modeling. There is no theoretical restriction against considering other types of phenomena depending on one, two or more factors.

Belated hello to the Author of the Figure, Please let the readers of this page know whether it makes sense to replace the variance of the single-distance-weighted average with the kriging variance of a set of kriged estimates? Is it possible that this practice violates the requirement of functional independence and ignores the concept of degrees of freedom? Does the data set in your figure display a significant degree of spatial dependence? Thanks for your response! JWM --Iconoclast 22:30, 10 July 2006 (UTC)[reply]

The Author of the Figure should peruse Matheron's introduction to Journel and Huijbregts's Mining Geostatistics to find out who coined the term geostatistics and why! It would be useful if the primary data for the Figure were posted to allow the application of a proper test for spatial dependence. JWM. --Iconoclast 18:30, 3 August 2006 (UTC)[reply]


-- Maybe we do not agree on what Kriging is exactly. Kriging starts with the hypothesis that the observations (the data) are sample values of a random process with known or unknown mean m(x) and covariance k(x,y). Note that the covariance need not be stationary. Then, Kriging is just a linear predictor. Nothing more. The practical question is: when can we make the assumption that the observations are sample values of a random process? The answer is, in my opinion, that it can always be done. A random process is just a model, and statistics can tell us whether the chosen model is probable or not.

Further revision proposal by Scheidtm

Kriging is a regression technique used in geostatistics to approximate or interpolate data. The theory of Kriging was developed from the seminal work of its inventor, Danie G. Krige, by the French mathematician Georges Matheron in the early sixties. In the statistical community, it is also known as Gaussian process regression. Kriging is also a reproducing kernel method (like splines and support vector machines).

Figure: example of one-dimensional data interpolation by Kriging, with confidence intervals

Idea Behind Kriging

As Kriging was developed in mining, it will be explained in this setting here. It can be, and is, used in other contexts too. Please keep this in mind when reading this article.

Kriging is often used to predict the distribution of some quantity of interest in a geological survey. For example, one wants to determine the gold concentration in a mine field from a limited number of exploratory diggings.

Each of the results could be regarded as a single draw from an unknown random distribution, whose form is determined by the geological processes moving and layering the material in the neighbourhood of the place of mining. But as different places would have different geological neighbourhoods and histories, the random distributions would also (slightly) differ, so that a general prediction of ore content would be difficult, because one does not know the differences between these random distributions.

Kriging escapes from these difficulties by using the prior knowledge that these random distributions differ only slightly. It does this by treating all measurements as one draw from a single probability distribution, which is then called a random process or, better, a random field. The additional assumptions made about this process encode this prior knowledge, and not only allow one to predict the wanted quantity, but also to give confidence intervals for the predictions.

Simple Kriging

  • Give assumptions of simple kriging, develop formulas for prediction, confidence intervals.
  • correlation and standard forms (gaussian, exponential, spherical).
  • discontinuity at origin (Nugget Effect) => interpolating or smoothing
  • differentiability at origin => roughness.
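
A minimal sketch of the three standard covariance forms named in the list above, written as functions of the distance h. The function names, the `sill` and `length` parameters and the exact parameterization are assumptions made for illustration (conventions differ between textbooks and software packages), not part of the proposal.

```python
import numpy as np

# Hypothetical helper names. h is the distance between two points,
# `sill` is the variance at distance zero, `length` is a range parameter.

def gaussian_cov(h, sill=1.0, length=1.0):
    """Gaussian model: smooth at the origin, giving very smooth predictions."""
    h = np.asarray(h, dtype=float)
    return sill * np.exp(-(h / length) ** 2)

def exponential_cov(h, sill=1.0, length=1.0):
    """Exponential model: not differentiable at the origin, rougher paths."""
    h = np.abs(np.asarray(h, dtype=float))
    return sill * np.exp(-h / length)

def spherical_cov(h, sill=1.0, length=1.0):
    """Spherical model: linear near the origin, exactly zero beyond `length`."""
    h = np.abs(np.asarray(h, dtype=float))
    s = np.minimum(h / length, 1.0)
    return sill * (1.0 - 1.5 * s + 0.5 * s ** 3)

def with_nugget(cov, nugget):
    """Wrap a model so that it jumps by `nugget` at h = 0 (nugget effect)."""
    def wrapped(h, **kwargs):
        h = np.asarray(h, dtype=float)
        return cov(h, **kwargs) + nugget * (h == 0)
    return wrapped
```

For example, `with_nugget(exponential_cov, 0.1)` gives a covariance that is discontinuous at the origin, which is the property the list above associates with the choice between interpolating and smoothing.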

Best Linear Unbiased Estimator

  • Describe features of Kriging

Pros and Cons

  • to be developed

Extensions of Simple Kriging

  • Describe how assumptions are relaxed, what is predicted by each of the advanced Kriging methods.

Software implementing Kriging

  • Give list (does not strive to be exhaustive).
    • The Stanford Geostatistical Modeling Software ( S-GeMS )


I agree with Scheidtm's proposed reorganization of this article. However, I think it is clear that we need a better diagram that more clearly illustrates the application of the technique. Would Emmanuel be interested in producing a revised version of Example_krig.png? Matt 02:49, 22 August 2006 (UTC)[reply]

Confusing: "lost the correspondingly infinite set of variances"

I marked this article {{confusing}} because of the phrase, "lost the correspondingly infinite set of variances" in the introductory paragraph, which is not well-defined before it is used, nor wiki- or hyper-linked. I suggest that the first three paragraphs need a complete re-write as a better introduction, with less jargon and bias (2nd paragraph, hyperlinked to Geophys. web site, shows bias.) --James S. 19:16, 2 April 2006 (UTC)[reply]

I moved the two troubled paragraphs to "History" and added a {{SectPOV}} tag in front of the hyperlink. --James S. 19:20, 2 April 2006 (UTC)[reply]

The two paragraphs seem to be pushing a POV that geostatistics is some sort of hoax. This is unlikely, considering that statisticians (other than geostatisticians) use Gaussian Process Regression, and have shown that it is a Bayesian technique (where the kernel function describes a Gaussian Process Prior over functions).
I saved the list of methods named after Krige, but deleted the POV. -- hike395 21:16, 2 April 2006 (UTC)[reply]
I think I finally understand Dr. Merks' objection --- in the Bayesian analysis, spatial dependence is an assumption, while Jan is advocating performing statistical tests on the spatial dependence before blindly using kriging. The latter is a frequentist viewpoint (as I understand it). I did some quick research on what statistical tests are commonly used in spatial statistics, found three, and cited them. -- hike395 16:00, 7 April 2006 (UTC)[reply]

In mathematical statistics, one-to-one correspondence between central values (the arithmetic mean and various weighted averages) and their variances is sine qua non. In geostatistics, however, one-to-one correspondence between distance-weighted averages-cum-kriged estimates and their variances is null and void. In other words, the infinite set of variances was lost on Krige's watch and the variance of the SINGLE distance-weighted average was replaced with the perfectly smoothed pseudo kriging variance of a SUBSET of some infinite set of kriged estimates! Geostatistics is a scientific fraud because spatial dependence between (temporally or in situ ) ordered sets is assumed! Remember Bre-X. That's all!--Iconoclast 00:53, 8 April 2006 (UTC)[reply]

I believe I addressed your objections in a way that is NPOV and verifiable --- some people assume spatial dependence, other people test for it. Citations for both viewpoints are included in the article. -- hike395 21:29, 8 April 2006 (UTC)[reply]

latest revert

Two problems with the article, that I reverted:

  1. The previous version claimed that Krige knew certain facts. This is very difficult to verify: a high standard is needed. Do we have any citations to show what Krige was thinking of?
  2. The paragraph about Fisher's F-test. Again, this seems like original research. I can only find material about applying that particular test from Dr. Merks himself (his web site [1], comments at ai-geostats [2], comments at amazon.com[3]) and no place else. Again, if this is supported in the common literature, I'd be happy to add it to the paragraph that lists common statistical tests applied to spatial data.

-- hike395 21:37, 8 April 2006 (UTC)[reply]

My two cents

I'm going to chime in here: while I appreciate Mr. Merks' contributions, I need to emphasise that our core policies include no original research, and in this case that means not including information that cannot be verified by reference to published sources written by someone other than the contributing author. Kriging is accepted both by the scientific community and by policy makers worldwide. Continued insertion of the disputed material is in violation of our POV policy as well as NOR and V. Thanks! Antandrus (talk) 18:27, 10 April 2006 (UTC)[reply]

Fact or Fiction

Sir Ronald A Fisher was knighted in 1953 because of his work on analysis of variance, the essence of which is his F-test. It was Snedecor who called it Fisher's F-test. One might suggest that Fisher's F-test does not qualify as "original research" under WP's core policies. I don't know what Krige "knew" but what I do know is he didn't know each and every distance-weighted average had its own variance long before Fisher was knighted. It would be a lot worse if Krige did know about one-to-one correspondence between distance-weighted averages and variances but decided to ignore it. Neither do I know if Matheron and his students knew that its rebirth as an honorific kriged estimate would make its variance vanish without leaving a trace in geostatistical literature. In fact, I know very little because prominent geostatisticians rather assume, krige, smooth and rig the rules of mathematical statistics than respond to the simple question: Does or doesn't each kriged estimate have its own variance? What a pity that this question violates WP's core NPOV policy! So why not play Clark and the Kriging Game rather than waffle with weasel words? By the way, the ordered set of data in the above figure does not display a significant degree of spatial dependence. Wikipedians ought to check that out! --Iconoclast 16:17, 12 April 2006 (UTC)[reply]

The description of the F-test is not original research, talking about Ronald A Fisher may not be. However, you yourself have said that the application of the F-test to spatial dependency is not generally accepted in geology. I can't find any other references to the use of the F-test applied to spatial dependency, other than your own work. Therefore, the application of the F-test is original research, according to the WP rules.
Asking questions on Talk pages does not violate NPOV. WP:NPOV talks about the phrasing of the content of an article. If you say "Kriging is clearly invalid, because of blah blah blah", that's a POV phrasing. It's like journalism: you have to use "he said/she said" language. An NPOV phrasing, for example, would be:
Kriging is a commonly applied technique to model distribution of ore.[1] However, some practitioners question the assumption that spatial dependence follows a stochastic process.[2] Other practitioners recommend using statistical tests to test the assumption of spatial dependency.[3][4][5]
See what I mean? The article doesn't say that the field is invalid (that's a particular Point of View). Perhaps it should say that kriging is commonly used, but some people question the assumptions and/or use statistical tests to check the assumptions.
-- hike395 09:43, 13 April 2006 (UTC)[reply]

References

  1. ^ Cressie, Noel A.C. (1993). Statistics for Spatial Data. Wiley-Interscience.
  2. ^ Philip, G. M. (1986). "Matheronian Statistics --- Quo vadis?". Mathematical Geology. 18 (1): 93–117.
  3. ^ Fortin, Marie-Josee (2005). Spatial Analysis: A Guide for Ecologists.
  4. ^ Ullah, Ullah (1998). Handbook of Applied Economic Statistics. p. 265.
  5. ^ Schabenberger, Oliver (2001). Contemporary Statistical Models for the Plant and Soil Sciences. p. 653.

Making this page useful - Give sources or get out

The continued resistance of the one "author" here to provide additional citations to back up his beefs has rendered this entry utterly useless. Quit trying to impose your squatter's rights on the discussion and abide by the request or leave it be. Using Wikipedia to direct people to your site is crappy - this is the ONLY page where I've seen this problem persist through such stubborn dogma. Dogma is opinion, not informed, collaborative dissent and disagreement. You clearly are confusing your role here as an "educator" and instead are an impediment (and frankly a pariah in my eyes) to my understanding since I can't verify what you're saying because you can't be bothered.

This comment additionally applies to all the other connected concepts that you put under the umbrella of your disagreement with kriging (do you really contest variograms and semi-variograms, or just kriging?). Please... GET ON WITH IT, or over it.

209.116.30.220 18:13, 24 July 2006 (UTC)[reply]

I'm attempting to do what needs to be done to ensure that scientific integrity and sound science prevail on Wikipedia. I'll post more references if and when required. Wouldn't it be of interest to verify whether the primary data set for the kriging figure displays a significant degree of spatial dependence? You were talking to the undersigned, weren't you? Anonymity is somewhat confusing! JWM. --Iconoclast 16:00, 25 July 2006 (UTC)[reply]

I do not object to the inclusion of a section, 'Controversy', that questions the validity of the statistical technique, based on referenced sources. However, I don't think this article requires 8 references to your own published works (perhaps your user page would be a more appropriate place?). Furthermore, it is my opinion that the opening paragraph of this article should introduce the topic, Kriging, in a manner that is accessible to the encyclopedia reader. Launching straight into a discussion of "what Krige, Matheron and his following did not know in those days" seems to obfuscate rather than elucidate Matt 01:19, 22 August 2006 (UTC)[reply]

Sorry, Matt, but I question the validity of the geostatistical technique of assuming spatial dependence, interpolating by kriging, smoothing pseudo kriging variances, and rigging the rules of mathematical statistics. Why not have somebody explain what kriging is really all about? And what about verifying spatial dependence between the ordered set of measured values in the above Figure1? JWM. --Iconoclast 18:47, 22 August 2006 (UTC)[reply]

Hi Jan, I didn't mean to imply that your contributions to this article are unimportant. However, in my opinion the Kriging article should primarily be aimed at introducing the topic to readers who are unfamiliar with the technique (and possibly with geostatistics in general). It is first required to explain exactly what kriging is, before its shortcomings can be adequately addressed. A prominent and detailed Controversy section serves the purpose of warning the reader to treat the technique with caution, and not to accept its conclusions at face value. --Matt 12:50, 27 August 2006 (UTC)[reply]

Could someone include usage in a sentence? I've found this useful on other WP pages that give it at the top when capitalization is a question. Didn't want to screw it up, so I'll let one of the many debating experts here decide whether to include it.

Make Information, not War

I came to the Kriging page in order to understand what kriging is, since I encountered the term in a software package (in non-geostatistical context -- it had to do with interpolating sampled elevation points). I expected to:

  1. learn how data are interpolated in the kriging method
  2. find at least one equation defining the method
  3. learn how kriging compares to other methods of interpolation: linear, quadratic, spline, etc.
  4. see a diagram of kriged data, preferably compared with diagrams of data interpolated by other means
  5. learn the relative strengths and shortcomings of this method of interpolation

But I was disappointed in that respect. On the other hand, I do not give a rat's fart about:

  1. the wickedness of prof. Krige
  2. the metaphysical issues of having one's own variance
  3. historical references
  4. name-calling among prominent geostatisticians
  5. correct capitalization of the word “kriging”

The only useful information I found was buried halfway down the page and read: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations, where the weights are the unknown regression parameters. The optimality criterion used to arrive at the Kriging system, as mentioned above, is a minimization of the error variance in the least-squares sense.” However, and very regrettably, the alluded-to set of linear equations was not given anywhere on the page.

Does anyone here have the discipline to adequately explain and illustrate the term in question before launching into controversies and edit wars? The article as it stands now consists of a lot of obscure discussion of abstruse side-issues, with regard to a main topic that is not even decently summarized. I do realize that the editors are all expert geostatisticians, who know kriging like the back of their hand; but most encyclopedia readers have no such prior knowledge, and expect to find it in the article. Respectfully yours, Freederick 15:16, 6 November 2006 (UTC)[reply]


A short tutorial on Kriging

The following paragraphs come from a paper that I started to write but never finished.

-- The author of the Figure --

The objective of this section is to present Kriging, a method to interpolate or approximate scattered observed data, which can be used to model non-linear phenomena or complex systems in engineering. The interpolation (or approximation) is obtained by linear prediction of a spatial random process. Kriging is very computationally practical and its implementation is easy, since it consists in solving a system of linear equations. This presentation shall explain the theory of this method and shall also explain the fundamental connections between Kriging and other similar methods based on the theory of reproducing kernels, namely, radial basis functions (RBF) [1], splines \citep{schoenberg64:_splin, duchon76:_inter, Wah90} and support vector machines (SVM) related methods \citep{vapnik95nature,smola98tutorial,schol02}. The aspects concerning the choice of a kernel will also be presented.


History

Kriging originates from the early 1950s work of D.G. Krige, a South-African mining engineer whose aim was to produce maps of ore grade from scattered samples \citep{krige51:_witwat}. The method was adapted and formalized by the French mathematician Georges Matheron, who gave it its present name \citep{Mat63}. Kriging is nowadays one of the basic tools of \emph{geostatistics}, a branch of statistics that deals with the description of phenomena involving spatial factors, such as ore prospecting, meteorology, oceanology, etc. In this context, Kriging cannot be dissociated from geostatistical concepts such as \emph{structural analysis}, which is the step that consists in choosing a covariance function from the observed data. Geostatisticians have long experience with data modeling, and this experience proves to be helpful for the choice of a kernel, a fundamental issue in practice in reproducing kernel methods. We shall also consider \emph{Intrinsic Kriging}, an extension of Kriging also developed by the geostatisticians, which makes it possible to deal with non-stationary processes, more specifically, random processes comprising unknown trends. An overview of the history of Kriging in the context of geostatistics can be found in \citep{cressie90origin}; see also \citep{chiles99,cressie93statistics} for comprehensive references on the subject. Because of its spatial origin, Kriging has long been restricted to problems where there were only two or three factors -- corresponding to a position -- and it took quite some time to realize that it could also be used in the world of engineering, with more factors of a more diverse nature (see, e.g., \citep{Sac89}). Kriging also has strong connections with the theory of time series, and basically uses the same concepts. Note also that in the pattern recognition community, Kriging is better known under the name of \emph{Gaussian processes} \citep{Wil95}.

Linear prediction and Kriging

Consider a \emph{system} with output denoted by $f(\x)$. The output depends on the values taken by the system inputs, denoted by a vector $\x \in \RR^d$. This vector of inputs will be referred to as the \emph{factors} and can be any quantity that characterizes the conditions under which the system operates. The objective of Kriging is to predict the output of the system for a given $\x$. For this purpose, a \emph{black-box model} is built based on a finite set of observations $f_{{\x}_i}$, $i \in \{1,\cdots,n\}$ of the output of this system, for various values $\x_i$, $i \in \{1,\cdots,n\}$. An observation $f_{{\x}_i}$ is not necessarily equal to $f(\x_i)$ since the output may be corrupted by noise. Mathematically, the problem of predicting $f(\x)$, based on the observation set $(\x_i, f_{{\x}_i})$, $i=1,\cdots, n$ can be formulated as one of function approximation or interpolation.

Since the system remains uncertain despite the observations, a natural idea is to model the output of the system by a random process, denoted by $F(\x)$. The observed outputs $f_{\x_i}, i=1,\cdots,n$ are thus considered to be realizations of the random variables $F(\x_i)$. The observation noise, which can corrupt the output, is not taken into account in this first section. With this probabilistic formulation, a first approach to predict the system could be to simulate the output \emph{conditionally} on the observed random variables (see conditional simulation in annexes). Such an approach is shown on Figure~\ref{fig:simu}, where several simulated realizations, or trajectories, of the process are represented. Since each conditional trajectory interpolates the data, the simulation can be seen as one possible way of predicting the system. However, it is often preferred to choose \emph{one relevant prediction}, for instance an ``average'' trajectory, smoother than the realizations of the random process, in order to minimize the risk of a wrong prediction.
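
The conditional simulation mentioned above can be sketched as follows. This is only an illustration under added assumptions: the process is taken to be a zero-mean \emph{Gaussian} process with a Gaussian covariance, there is one factor only, and the function names, the `length` parameter and the small `jitter` term are hypothetical choices, not the method of the annexes.

```python
import numpy as np

def cov(a, b, length=0.3):
    """Assumed Gaussian covariance between two 1-D arrays of points."""
    d = np.asarray(a, float)[:, None] - np.asarray(b, float)[None, :]
    return np.exp(-(d / length) ** 2)

def conditional_simulation(x_obs, f_obs, x_new, n_paths=5, jitter=1e-8, seed=0):
    """Draw trajectories of a zero-mean Gaussian process given observations."""
    rng = np.random.default_rng(seed)
    R = cov(x_obs, x_obs) + jitter * np.eye(len(x_obs))  # jitter for stability
    r = cov(x_obs, x_new)                                # shape (n, m)
    W = np.linalg.solve(R, r)                            # R^{-1} r, no explicit inverse
    mean = W.T @ f_obs                                   # conditional mean
    C = cov(x_new, x_new) - r.T @ W                      # conditional covariance
    C = 0.5 * (C + C.T)                                  # enforce symmetry
    # Eigen-decomposition tolerates tiny negative eigenvalues from round-off.
    vals, vecs = np.linalg.eigh(C)
    A = vecs * np.sqrt(np.clip(vals, 0.0, None))
    z = rng.standard_normal((len(x_new), n_paths))
    return mean[:, None] + A @ z                         # shape (m, n_paths)

x_obs = np.array([0.0, 0.3, 0.6, 1.0])
f_obs = np.array([1.0, -0.5, 0.2, 0.7])
paths = conditional_simulation(x_obs, f_obs, np.linspace(0.0, 1.0, 21))
```

Each column of `paths` is one trajectory; all of them pass (up to the jitter) through the observed values, while they differ between observations.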

The kriging method is to choose the \emph{best linear predictor}, which is explained in the remainder of this section. \emph{Linearity} implies that for all $\x$ the predictor $\hat{F}(\x)$ of $F(\x)$ is obtained as a linear projection on the space $\HH_S = \mathsf{span} \{F({\x}_1), \cdots, F({\x}_n)\}$, \emph{i.e.} a linear combination written as \begin{equation}
 \label{eq:1}
 \hat{F}(\x) = \sum_{i=1}^n \lambda_i(\x) F({\x}_i)\,,
\end{equation} where $\lambda_i(\x)\in \RR$ for all $i \in \{1,\cdots,n\}$. The \emph{best} approximation corresponds to choosing an orthogonal projection. In order to define this orthogonal projection it is assumed that the space of random variables is endowed with the classical scalar product, the expectation of the product of two random variables, that is, $(X,Y) = \EE[XY]$. The hypotheses on $F(\x)$ must also be specified at this stage. $F(\x)$ is assumed to be a stationary, second-order random process defined by its \emph{mean} $b=\EE[F(\x)]$ and \emph{auto-covariance function}, or in short \emph{covariance}, written as \begin{equation} \label{eq:2} R(\x,\vb{y}) = \cov [F(\x), F(\vb{y})]\,. \end{equation}

This covariance plays a fundamental role in Kriging since the prediction mainly depends on the choice of a given covariance, as will be discussed in Section~\ref{sec:choosing-covariance}. Note that the hypothesis of stationarity will be discussed in Section~\ref{sec:regul-krig} when introducing intrinsic Kriging. For the time being, it will also be assumed that $F(\x)$ is a \emph{zero-mean} process. If $b$ is known and differs from zero, it can be subtracted from $F(\x)$.


Orthogonal projection is obtained when the prediction error $\hat{F}(\x)-F(\x)$ is orthogonal to $\HH_S$, \emph{i.e.} \begin{equation} \label{eq:3} \EE[(\hat{F}(\x) - F(\x))F({\x}_i) ] = 0\,, \quad \forall i \in \{1,\cdots , n\}\,, \end{equation} or equivalently, when the variance of the prediction error, written as $\var[\hat{F}(\x) - F(\x)]$, is minimized. This is a classical least-squares regression problem and its solution can be written using the well-known linear prediction formula (see Annex~1) \begin{equation} \label{eq:4} \hat{F}(\x) = \bm{\lambda}(\x)\tr \vb{F} = \vb{r}\tr(\x) \vb{R}^{-1} \vb{F}\,, \end{equation} where $\bm{\lambda}(\x)\tr = [\lambda_1(\x), \cdots, \lambda_n(\x)]$, $\vb{r}\tr(\x)$ is the row vector of covariances, $$ \vb{r}\tr(\x) = [R({\x}_1, \x), \cdots, R({\x}_n ,\x)]\,,$$ and $\vb{R}$ is the covariance matrix of the random vector $$ \vb{F} = [F({\x}_1), \cdots, F({\x}_n)]^{\mathsf{T}}\,. $$ The covariance matrix $\vb{R}$ is in general full rank so that its inverse exists (of course, one should not explicitly invert the matrix to solve the linear system). However, when the number of observations increases, the matrix can become ill-conditioned and lead to numerical instabilities.
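
The step between the orthogonality condition (\ref{eq:3}) and the prediction formula (\ref{eq:4}) can be spelled out as follows; this is a sketch under the zero-mean assumption made above, using only notation already defined. Substituting (\ref{eq:1}) into (\ref{eq:3}) and using $\EE[F({\x}_j)F({\x}_i)] = R({\x}_j,{\x}_i) = R({\x}_i,{\x}_j)$ gives \begin{equation} \sum_{j=1}^n \lambda_j(\x) R({\x}_i, {\x}_j) = R({\x}_i, \x)\,, \quad \forall i \in \{1,\cdots,n\}\,, \end{equation} that is, in matrix form, \begin{equation} \vb{R}\, \bm{\lambda}(\x) = \vb{r}(\x)\,. \end{equation} This is the Kriging system of linear equations, with the weights $\lambda_j(\x)$ as unknowns. When $\vb{R}$ is invertible, $\bm{\lambda}(\x) = \vb{R}^{-1}\vb{r}(\x)$, and since $\vb{R}$ is symmetric, $\bm{\lambda}(\x)\tr \vb{F} = \vb{r}\tr(\x) \vb{R}^{-1} \vb{F}$, which is (\ref{eq:4}).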

Note that the predictor (\ref{eq:4}) is unbiased since the mean of $F(\x)$ is known. A simple example of linear prediction is illustrated by Figure~\ref{fig:ex_krig}, which represents an interpolation with the output depending on one factor only. Thus, Kriging makes it possible to predict the output of a system for factor values that have not been observed. The interpolation property means that when the factors are assigned values corresponding to past observations, the prediction is equal to the already observed output. It should also be intuitive that the more observations are made, the more precise the prediction becomes, which is explained below.

The main properties of Kriging are best explained by the behavior of the variance of the prediction error, which is given by the Pythagorean relation \begin{eqnarray}
 \label{eq:var_error}
 \var[\hat{F}(\x) - F(\x)] &=& \var[F(\x)] - \var[\hat{F}(\x)] \\
 &=& R(\x,\x) - \bm{\lambda}(\x)\tr \vb{R} \bm{\lambda}(\x) \\
 &=& R(\x,\x) - \vb{r}\tr(\x) \vb{R}^{-1} \vb{r}(\x)\,.
\end{eqnarray} It is then straightforward to assess the quality of the prediction with confidence intervals (error bars) deduced from the square root of the variance of the error (error bars are also shown in Figure~\ref{fig:ex_krig}).
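The variance formula can likewise be sketched in Python. Again this is a hedged illustration, not a reference implementation: the exponential covariance `exp_cov`, the sample points, and the helper name `kriging_error_variance` are assumptions introduced here. The factor 1.96 used for the error bars additionally assumes that the process is Gaussian, which the text does not state.

```python
import numpy as np

def exp_cov(a, b, scale=1.0):
    """Hypothetical stationary covariance R(x, y) = exp(-|x - y| / scale)."""
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return np.exp(-np.abs(a - b) / scale)

def kriging_error_variance(x_obs, x_new, cov=exp_cov):
    """Error variance R(x, x) - r(x)^T R^{-1} r(x) at each point of x_new."""
    R = cov(x_obs, x_obs)               # n x n covariance matrix
    r = cov(x_obs, x_new)               # n x m, column j is r(x_new[j])
    lam = np.linalg.solve(R, r)         # column j holds the weights lambda(x_new[j])
    prior = np.diag(cov(x_new, x_new))  # R(x, x) for each new point
    var = prior - np.sum(r * lam, axis=0)
    return np.clip(var, 0.0, None)      # remove tiny negative values due to rounding

# Illustrative points (assumed): one observed, two interior, one far from the data.
x_obs = np.array([0.0, 1.0, 2.5, 4.0])
x_new = np.array([0.0, 0.5, 1.7, 6.0])

var = kriging_error_variance(x_obs, x_new)
half_width = 1.96 * np.sqrt(var)        # 95% error bars under a Gaussian assumption
```

The variance vanishes at an observed point and grows towards the prior variance $R(\x,\x)$ far from the data, which is why additional observations tighten the error bars.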


To be continued

References

[1] Powell, M. J. D., Radial basis functions for multivariable interpolation: A Review, Algorithms for Approximation of Functions and Data, Oxford University Press, J.C. Mason and M.G. Cox Eds, pp 143-167, 1987

Where is the meat?

Quoting from the article: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations,...”

Quoting from the last (anonymous) edit on the Talk Page: “Kriging is very computationally practical and its implementation is easy, since it consists in solving a system of linear equations.”

Where are the goddamn equations? Are they legendary? IIUC, they should be the main point of the article, which is well-nigh useless without them. Freederick 19:45, 2 December 2006 (UTC)[reply]

Maybe you can read Portuguese?
No. Freederick 22:45, 18 January 2007 (UTC)[reply]

References to Matheronian voodoo statistics ought not to be removed!--Merksmatrix 22:21, 3 February 2007 (UTC)[reply]

Dear Mr Nick Didlick aka Merksmatrix,

First, I think you do not understand very well what linear prediction is about and what Kriging means. In my opinion, you tend to confuse the data with the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be linear? In any case, if people want to use Kriging, why do you want to prevent them from doing so?

Why do you persist in using Wikipedia to spread your own point of view, against the NPOV?

If you have business in telling revisionist stories against Kriging, good for you. But not on Wikipedia.

History section

The introduction as of now contains too much history in my opinion. I think the origin of the method should be postponed until after the method has been described, and in a dedicated History section. Berland 05:54, 6 February 2007 (UTC)[reply]