About 68.27% of the values lie within 1 standard deviation of the mean. Similarly, about 95.45% of the values lie within 2 standard deviations of the mean. Nearly all (99.73%) of the values lie within 3 standard deviations of the mean.
In mathematical notation, these facts can be expressed as follows, where x is an observation from a normally distributed random variable, μ is the mean of the distribution, and σ is its standard deviation:
Pr(μ − σ ≤ x ≤ μ + σ) ≈ 0.6827
Pr(μ − 2σ ≤ x ≤ μ + 2σ) ≈ 0.9545
Pr(μ − 3σ ≤ x ≤ μ + 3σ) ≈ 0.9973
These numerical values come from the cumulative distribution function Φ of the standard normal distribution. For example, Φ(2) ≈ 0.9772, that is, Pr(x ≤ μ + 2σ) ≈ 0.9772. Note that this is not a symmetrical interval: it is merely the probability that an observation is less than μ + 2σ. The probability that an observation is within 2 standard deviations of the mean is computed as follows (small differences are due to rounding):
Pr(μ − 2σ ≤ x ≤ μ + 2σ) = Φ(2) − Φ(−2) ≈ 0.9772 − (1 − 0.9772) ≈ 0.9545
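The calculation can be checked numerically. The following is a minimal sketch that writes Φ in terms of the error function from Python's standard library; the function names `phi` and `prob_within` are illustrative choices, not part of the rule itself.

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within(k: float) -> float:
    """Probability that an observation lies within k standard deviations of the mean."""
    return phi(k) - phi(-k)

if __name__ == "__main__":
    for k in (1, 2, 3):
        # Prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3.
        print(f"within {k} sigma: {prob_within(k):.4f}")
```

Because the interval is symmetric, `prob_within(k)` equals `math.erf(k / math.sqrt(2.0))`, which is the form used for the "Population in range" column further down.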
Statisticians might express these intervals as confidence intervals: μ ± 2σ is approximately a 95% confidence interval.
This rule is often used to obtain a quick, rough probability estimate for a quantity given its standard deviation, provided the population is assumed normal. It also serves as a simple test for outliers (again, if the population is assumed normal) and as a normality test (if the population is potentially not normal).
Recall that to express an observation as a number of standard deviations, one first computes the deviation: the error if the population mean is known, or the residual if the mean is only estimated. One then either standardizes (divides by the population standard deviation) if the population parameters are known, or studentizes (divides by an estimate of the standard deviation) if the parameters are unknown and only estimated.
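The two conversions can be sketched as follows. This is a simplified illustration, assuming a plain one-dimensional sample and using the sample standard deviation (with n − 1 in the denominator) as the estimate; the function names are hypothetical.

```python
import statistics

def standardize(values, mu, sigma):
    """Errors divided by the known population standard deviation."""
    return [(v - mu) / sigma for v in values]

def studentize(values):
    """Residuals divided by an estimate of the standard deviation.

    Both the mean and the standard deviation are estimated from the sample.
    """
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mean) / s for v in values]

if __name__ == "__main__":
    sample = [9.1, 10.4, 9.8, 10.9, 10.2]  # made-up data for illustration
    print(standardize(sample, mu=10.0, sigma=0.5))
    print(studentize(sample))
```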
To use the rule as a test for outliers or as a normality test, one expresses the deviations in units of standard deviations and compares their frequency with the expected frequency. Given a sample set, compute the studentized residuals and compare these to the expected frequency: points that fall more than 3 standard deviations from the mean are likely outliers (unless the sample size is very large, in which case one expects some points this extreme), and if there are many points more than 3 standard deviations from the mean, one likely has reason to question the assumed normality of the distribution. This holds ever more strongly for moves of 4 or more standard deviations.
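As a rough illustration of this comparison, the sketch below counts the points of a sample that lie more than k estimated standard deviations from the sample mean and reports the count a normal distribution would predict. The function names and the toy data are assumptions made for the example.

```python
import math
import statistics

def tail_fraction(k: float) -> float:
    """Expected fraction of normal observations more than k sigma from the mean."""
    return math.erfc(k / math.sqrt(2.0))

def compare_tail_counts(values, k: float = 3.0):
    """Return (observed, expected) counts of points beyond k standard deviations."""
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # estimated standard deviation
    observed = sum(1 for v in values if abs(v - mean) / s > k)
    expected = len(values) * tail_fraction(k)
    return observed, expected

if __name__ == "__main__":
    # Toy data: 100 unremarkable values plus one extreme value.
    data = [0.0] * 50 + [1.0] * 50 + [100.0]
    observed, expected = compare_tail_counts(data, k=3.0)
    print(f"observed beyond 3 sigma: {observed}, expected if normal: {expected:.2f}")
```

An observed count far above the expected one suggests either outliers or a non-normal population; the sketch does not decide between the two.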
One can compute this more precisely by approximating the number of extreme moves of a given magnitude or greater with a Poisson distribution. Put simply, if one observes multiple 4 standard deviation moves in a sample of size 1,000, one has strong reason to consider these outliers or to question the assumed normality of the distribution.
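A minimal sketch of the Poisson approximation follows, assuming independent observations so that the number of moves beyond k standard deviations in a sample of size n is roughly Poisson with mean n · (1 − erf(k/√2)). The function name is illustrative.

```python
import math

def prob_at_least(m: int, n: int, k: float) -> float:
    """Poisson approximation to P(at least m of n observations exceed k sigma)."""
    lam = n * math.erfc(k / math.sqrt(2.0))  # expected number of such moves
    # P(fewer than m events) under Poisson(lam), summed term by term.
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(m))
    return 1.0 - below

if __name__ == "__main__":
    # Two or more 4-sigma moves in 1,000 observations.
    print(prob_at_least(2, 1000, 4.0))
```

Under these assumptions the probability of two or more 4 standard deviation moves in 1,000 normal observations comes out at roughly 0.002, which is why such a sample casts doubt on normality.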
Higher deviations 
Because the tails of the normal distribution decay faster than exponentially, the odds of higher deviations decrease very quickly. For normally distributed data:
|Range||Population in range||Expected frequency outside range||Approx. frequency for daily event|
|μ ± 1σ||0.682689492137086||1 in 3||Twice a week|
|μ ± 1.5σ||0.866385597462284||1 in 7||Weekly|
|μ ± 2σ||0.954499736103642||1 in 22||Every three weeks|
|μ ± 2.5σ||0.987580669348448||1 in 81||Quarterly|
|μ ± 3σ||0.997300203936740||1 in 370||Yearly|
|μ ± 3.5σ||0.999534741841929||1 in 2,149||Every six years|
|μ ± 4σ||0.999936657516334||1 in 15,787||Every 43 years (twice in a lifetime)|
|μ ± 4.5σ||0.999993204653751||1 in 147,160||Every 403 years|
|μ ± 5σ||0.999999426696856||1 in 1,744,278||Every 4,776 years (once in recorded history)|
|μ ± 5.5σ||0.999999962020875||1 in 26,330,254||Every 72,090 years|
|μ ± 6σ||0.999999998026825||1 in 506,797,346||Every 1.38 million years (history of humankind)|
|μ ± 6.5σ||0.999999999919680||1 in 12,450,197,393||Every 34 million years|
|μ ± 7σ||0.999999999997440||1 in 390,682,215,445||Every billion years|
|μ ± xσ||erf(x/√2)||1 in 1/(1 − erf(x/√2))||Every 1/(1 − erf(x/√2)) days|
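The numeric columns of the table can be reproduced for any x with a short calculation. The sketch below is one way to do it; the function name and the conversion of 365.25 days per year are assumptions made for the example, not values stated in the table.

```python
import math

def table_row(x: float, events_per_year: float = 365.25):
    """Return (population in range, '1 in N' outside range, years between daily events)."""
    inside = math.erf(x / math.sqrt(2.0))
    # erfc computes 1 - erf directly and avoids cancellation far in the tail.
    outside = math.erfc(x / math.sqrt(2.0))
    one_in = 1.0 / outside
    years = one_in / events_per_year  # assumes one observation per day
    return inside, one_in, years

if __name__ == "__main__":
    for x in (1, 2, 3, 4, 5, 6):
        inside, one_in, years = table_row(x)
        print(f"{x} sigma: {inside:.15f}  1 in {one_in:,.0f}  every {years:,.2f} years")
```

Using `math.erfc` instead of `1 - math.erf(...)` matters for large x, where the population in range is so close to 1 that the subtraction would lose most of its significant digits.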
Thus for a daily process, a 6σ event is expected to happen less than once in a million years. This gives a simple normality test: if one witnesses a 6σ event in daily data and significantly fewer than 1 million years have passed, then a normal distribution most likely does not provide a good model for the magnitude or frequency of large deviations in this respect. In The Black Swan, Nassim Nicholas Taleb gives the example of risk models for which the Black Monday crash was a 36-sigma event: the occurrence of such an event should instantly suggest a catastrophic flaw in the model.
See also 
- "The Normal Distribution" by Balasubramanian Narasimhan
- "Calculate percentage proportion within x sigmas" at WolframAlpha