Notation in probability and statistics

From Wikipedia, the free encyclopedia

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

Probability theory

  • Random variables are usually written in upper case roman letters: X, Y, etc.
  • Particular realizations of a random variable are written in the corresponding lower case letters. For example, x_1, x_2, …, x_n could be a sample corresponding to the random variable X. A cumulative probability is formally written P(X \le x) to distinguish the random variable from its realization.
  • The probability measure is sometimes written \mathbb{P} to distinguish it from other functions and from a generic measure P, which avoids having to state "P is a probability". \mathbb{P}(X \in A) is short for P(\{\omega \in \Omega: X(\omega) \in A\}), where \Omega is the sample space, \omega is an outcome, and X is a random variable.
  • \mathbb{P}(A \cap B) or \mathbb{P}[A \cap B] indicates the probability that events A and B both occur.
  • \mathbb{P}(A \cup B) or \mathbb{P}[A \cup B] indicates the probability of either event A or event B occurring ("or" in this case means one or the other or both).
  • σ-algebras are usually written in upper case calligraphic letters (e.g. \mathcal{F} for the collection of sets on which the probability P is defined).
  • Probability density functions (pdfs) and probability mass functions are denoted by lower case letters, e.g. f(x).
  • Cumulative distribution functions (cdfs) are denoted by upper case letters, e.g. F(x).
  • Survival functions, or complementary cumulative distribution functions, are often denoted by placing an overbar over the symbol for the cumulative distribution function: \overline{F}(x) = 1 - F(x).
  • In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
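  • Some common operators: E[X] for the expected value of X, var[X] for the variance of X, and cov[X, Y] for the covariance of X and Y.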
  • "X is independent of Y" is often written X \perp Y or X \perp\!\!\!\perp Y, and "X is independent of Y given W" is often written X \perp\!\!\!\perp Y \,|\, W or X \perp Y \,|\, W.
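
The event notation above can be made concrete on a small finite sample space. The following is a minimal sketch, assuming a single roll of a fair die; the names omega_space, prob, X and prob_X_in are illustrative and do not come from any library.

```python
from fractions import Fraction

# Sample space Omega for one roll of a fair die (an assumed toy example).
omega_space = [1, 2, 3, 4, 5, 6]


def prob(event):
    """P(event) under the uniform measure; `event` is a set of outcomes."""
    return Fraction(len(event), len(omega_space))


def X(omega):
    """A random variable: maps an outcome to 1 if it is even, else 0."""
    return 1 if omega % 2 == 0 else 0


def prob_X_in(values):
    """P(X in A), shorthand for P({omega in Omega : X(omega) in A})."""
    return prob({w for w in omega_space if X(w) in values})


A = {2, 4, 6}  # event "the roll is even"
B = {1, 2}     # event "the roll is at most 2"

p_and = prob(A & B)  # P(A ∩ B): both events occur
p_or = prob(A | B)   # P(A ∪ B): one or the other or both

# Inclusion-exclusion ties the two together.
assert p_or == prob(A) + prob(B) - p_and

# For these particular events, P(A ∩ B) = P(A) P(B), so A and B are independent.
independent = p_and == prob(A) * prob(B)

print(p_and, p_or, independent, prob_X_in({1}))
```

Exact fractions are used instead of floats so that the independence check is an exact equality rather than a comparison up to rounding.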


  • Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
  • A tilde (~) denotes "has the probability distribution of".
  • Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., \widehat{\theta} is an estimator for \theta.
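
The tilde and hat conventions can be illustrated with a short simulation. This is a sketch under stated assumptions: the model X ~ N(θ, 1), the value of theta, the sample size and the choice of the sample mean as the estimator are all picked here for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is reproducible

theta = 2.0  # the "unknown" parameter; known here only because we simulate
n = 10_000   # sample size (an arbitrary choice)

# x_1, ..., x_n: realizations of the random variable X ~ N(theta, 1).
xs = [random.gauss(theta, 1.0) for _ in range(n)]

# theta_hat: the value taken by the sample-mean estimator on this sample.
theta_hat = statistics.fmean(xs)

print(f"theta = {theta}, theta_hat = {theta_hat:.4f}")
```

The code keeps theta and theta_hat as separate names for the same reason the notation does: the first is a fixed property of the population, the second is computed from data and changes from sample to sample.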

Critical values

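The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value x_α such that F(x_α) = 1 − α, where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:

  • z_α or z(α) for the standard normal distribution
  • t_{α,ν} or t(α, ν) for the t-distribution with ν degrees of freedom
  • \chi^2_{α,ν} or \chi^2(α, ν) for the chi-squared distribution with ν degrees of freedom
  • F_{α,ν_1,ν_2} or F(α, ν_1, ν_2) for the F-distribution with ν_1 and ν_2 degrees of freedom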
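
As a sketch of the definition of the upper critical value, the standard normal case can be computed with Python's standard-library statistics.NormalDist, whose pdf, cdf and inv_cdf methods play the roles of φ, Φ and the inverse of Φ. The helper names survival and upper_critical_value, and the level α = 0.05, are assumptions made for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def survival(x):
    """Survival function of the standard normal: 1 - Phi(x)."""
    return 1.0 - Z.cdf(x)


def upper_critical_value(alpha):
    """Return x_alpha such that Phi(x_alpha) = 1 - alpha."""
    return Z.inv_cdf(1.0 - alpha)


alpha = 0.05
z_alpha = upper_critical_value(alpha)

phi_0 = Z.pdf(0.0)  # phi(0), the density at the mode
Phi_0 = Z.cdf(0.0)  # Phi(0), which is 1/2 by symmetry

print(f"z_alpha = {z_alpha:.4f}")
print(f"P(Z > z_alpha) = {survival(z_alpha):.4f}")
```

By construction, the survival function evaluated at z_alpha returns α up to floating-point error, which is exactly the statement that the critical value is exceeded with probability α.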

Linear algebra

  • Matrices are usually denoted by boldface capital letters, e.g. \mathbf{A}.
  • Column vectors are usually denoted by boldface lower case letters, e.g. \mathbf{x}.
  • The transpose operator is denoted by either a superscript T (e.g. \mathbf{A}^T) or a prime symbol (e.g. \mathbf{A}′).
  • A row vector is written as the transpose of a column vector, e.g. \mathbf{x}^T or \mathbf{x}′.
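
The relation between column vectors, row vectors and the transpose can be sketched with plain nested lists. The transpose helper below is a hypothetical function written for this example, and representing a column vector as a list of one-element rows is an assumption of the sketch, not a standard.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


A = [[1, 2, 3],
     [4, 5, 6]]      # a 2 x 3 matrix

x = [[1], [2], [3]]  # a column vector, stored as a 3 x 1 matrix

A_T = transpose(A)   # the 3 x 2 transpose of A
x_T = transpose(x)   # the row vector: a 1 x 3 matrix

print(A_T)
print(x_T)
```

Transposing twice returns the original matrix, which is why a row vector can always be written as the transpose of a column vector and vice versa.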


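Abbreviations

Common abbreviations include:

  • a.e.: almost everywhere
  • a.s.: almost surely
  • cdf: cumulative distribution function
  • df: degrees of freedom
  • i.i.d.: independent and identically distributed
  • mgf: moment-generating function
  • pdf: probability density function
  • pmf: probability mass function
  • r.v.: random variable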
