Notation in probability and statistics

From Wikipedia, the free encyclopedia

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

Probability theory

  • Random variables are usually written in upper case roman letters: X, Y, etc.
  • Particular realizations of a random variable are written in the corresponding lower case letters. For example, $x_1, x_2, \ldots, x_n$ could be a sample corresponding to the random variable $X$. A cumulative probability is formally written $P(X \le x)$ to differentiate the random variable from its realization.
  • The probability is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a generic measure $P$, which avoids having to state "P is a probability". $\mathbb{P}(X \in A)$ is short for $P(\{\omega \in \Omega : X(\omega) \in A\})$, where $\Omega$ is the event space and $X$ is a random variable. The notation $\Pr(A)$ is used alternatively.
  • $\mathbb{P}(A \cap B)$ or $\mathbb{P}[B \cap A]$ indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted as $P(X, Y)$, the joint probability mass function or probability density function as $f(x, y)$, and the joint cumulative distribution function as $F(x, y)$.
  • $\mathbb{P}(A \cup B)$ or $\mathbb{P}[B \cup A]$ indicates the probability of either event A or event B occurring ("or" in this case means one or the other or both).
  • σ-algebras are usually written with uppercase calligraphic letters (e.g. $\mathcal{F}$ for the set of sets on which we define the probability P).
  • Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. $f(x)$ or $f_X(x)$.
  • Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. $F(x)$ or $F_X(x)$.
  • Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative: $\overline{F}(x) = 1 - F(x)$, or denoted as $S(x)$.
  • In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
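  • Some common operators:
    • $\mathrm{E}[X]$: expected value of X,
    • $\operatorname{var}[X]$: variance of X,
    • $\operatorname{cov}[X, Y]$: covariance of X and Y.
  • X is independent of Y is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and X is independent of Y given W is often written $X \perp\!\!\!\perp Y \mid W$ or $X \perp Y \mid W$.
  • $P(A \mid B)$, the conditional probability, is the probability of $A$ given $B$, i.e., the probability of $A$ after $B$ is observed.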
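
As a rough illustration of the notation in this list, the following Python sketch enumerates a small sample space and evaluates a joint, a union and a conditional probability, together with a cdf and a survival function. The two-dice setup, the events A and B, and the helper names prob, cdf and survival are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product
from statistics import NormalDist

# Sample space Omega: all ordered outcomes of two fair six-sided dice
# (an illustrative assumption, not taken from the text).
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform measure on omega; event is a set of outcomes."""
    return Fraction(len(event), len(omega))

# Hypothetical events chosen for illustration.
A = {w for w in omega if w[0] == 6}            # first die shows 6
B = {w for w in omega if w[0] + w[1] >= 10}    # total is at least 10

p_joint = prob(A & B)              # P(A ∩ B): both A and B occur
p_union = prob(A | B)              # P(A ∪ B): A or B or both
p_cond = prob(A & B) / prob(B)     # P(A | B): A given that B is observed

# Random variable X(w) = total of the two dice.
def cdf(x):
    """F_X(x) = P(X <= x)."""
    return prob({w for w in omega if w[0] + w[1] <= x})

def survival(x):
    """Survival function: 1 - F_X(x) = P(X > x)."""
    return 1 - cdf(x)

# Standard normal pdf and cdf, usually written phi(z) and Phi(z).
Z = NormalDist()           # mean 0, standard deviation 1
phi_0 = Z.pdf(0.0)
Phi_0 = Z.cdf(0.0)

print(p_joint, p_union, p_cond, cdf(7), survival(7), phi_0, Phi_0)
```

Exact fractions are used for the dice so that identities such as survival(x) = 1 − cdf(x) hold without rounding error.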


Statistics

  • Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
  • A tilde (~) denotes "has the probability distribution of".
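  • Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., $\hat{\theta}$ is an estimator for $\theta$.
  • The arithmetic mean of a series of values $x_1, x_2, \ldots, x_n$ is often denoted by placing an "overbar" over the symbol, e.g. $\bar{x}$, pronounced "x bar".
  • Some commonly used symbols for sample statistics are given below:
    • the sample mean $\bar{x}$,
    • the sample variance $s^2$,
    • the sample standard deviation $s$,
    • the sample correlation coefficient $r$,
    • the sample cumulants $k_r$.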
  • Some commonly used symbols for population parameters are given below:
    • the population mean μ,
    • the population variance $\sigma^2$,
    • the population standard deviation σ,
    • the population correlation ρ,
    • the population cumulants $\kappa_r$.
  • $x_{(k)}$ is used for the $k$-th order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size $n$.
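
The sample-side symbols in this list can be tied to concrete computations. The sketch below uses Python's standard statistics module on a made-up sample; the data values and the variable names (x_bar, s2, s, order, mu_hat) are assumptions chosen to mirror the notation, not established identifiers.

```python
import statistics

# Hypothetical sample x_1, ..., x_n (values chosen for illustration only).
x = [4.0, 7.0, 1.0, 8.0, 5.0]
n = len(x)

x_bar = statistics.fmean(x)     # sample mean, "x bar"
s2 = statistics.variance(x)     # sample variance s^2 (divides by n - 1)
s = statistics.stdev(x)         # sample standard deviation s

# Order statistics x_(1) <= x_(2) <= ... <= x_(n).
order = sorted(x)
x_min = order[0]                # x_(1), the sample minimum
x_max = order[-1]               # x_(n), the sample maximum

# A hat marks an estimator: here mu_hat = x_bar estimates the population mean mu.
mu_hat = x_bar

print(n, x_bar, s2, s, x_min, x_max, mu_hat)
```

Note that statistics.variance and statistics.stdev compute the sample versions; the population counterparts in the same module are pvariance and pstdev.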

Critical values

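The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value $x_\alpha$ such that $F(x_\alpha) = 1 - \alpha$, where $F$ is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:

  • $z_\alpha$ or $z(\alpha)$ for the standard normal distribution
  • $t_{\alpha,\nu}$ or $t(\alpha,\nu)$ for the t-distribution with $\nu$ degrees of freedom
  • $\chi^2_{\alpha,\nu}$ or $\chi^2(\alpha,\nu)$ for the chi-squared distribution with $\nu$ degrees of freedom
  • $F_{\alpha,\nu_1,\nu_2}$ or $F(\alpha,\nu_1,\nu_2)$ for the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom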
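
The defining relation, that the cdf evaluated at the upper critical value equals 1 − α, can be checked numerically. The following sketch does so for the standard normal distribution using Python's statistics.NormalDist; the function name upper_critical_value is a hypothetical helper introduced here for illustration.

```python
from statistics import NormalDist

def upper_critical_value(alpha, dist=None):
    """Return the value exceeded with probability alpha, i.e. the
    point where the cdf of `dist` equals 1 - alpha."""
    if dist is None:
        dist = NormalDist()        # standard normal: mean 0, sd 1
    return dist.inv_cdf(1 - alpha)

z_05 = upper_critical_value(0.05)      # upper 5% point
z_025 = upper_critical_value(0.025)    # upper 2.5% point

# Check the definition: the cdf at the critical value is 1 - alpha.
check = NormalDist().cdf(z_05)

print(round(z_05, 4), round(z_025, 4), round(check, 4))
```

The standard library covers only the normal case; critical values for the t, chi-squared and F distributions would need a third-party statistics package.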

Linear algebra

  • Matrices are usually denoted by boldface capital letters, e.g. $\mathbf{A}$.
  • Column vectors are usually denoted by boldface lowercase letters, e.g. $\mathbf{x}$.
  • The transpose operator is denoted by either a superscript T (e.g. $\mathbf{A}^T$) or a prime symbol (e.g. $\mathbf{A}'$).
  • A row vector is written as the transpose of a column vector, e.g. $\mathbf{x}^T$ or $\mathbf{x}'$.
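
To make the column-vector and transpose conventions concrete, here is a small sketch in plain Python using nested lists. The helper functions transpose and matmul and the example values are assumptions for illustration; in practice a numerical library would normally be used instead.

```python
# Column vector x (3 x 1) and matrix A (2 x 3), stored as lists of rows.
x = [[1], [2], [3]]
A = [[1, 0, 2],
     [0, 1, 1]]

def transpose(M):
    """Return the transpose of M: rows become columns."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Matrix product of M (m x k) and N (k x n)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)]
            for row in M]

x_T = transpose(x)       # row vector (1 x 3), the transpose of x
A_T = transpose(A)       # 3 x 2 matrix, the transpose of A
Ax = matmul(A, x)        # product of A and x, a 2 x 1 column vector
xTx = matmul(x_T, x)     # row times column, a 1 x 1 matrix

print(x_T, A_T, Ax, xTx)
```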


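Abbreviations

Common abbreviations include:

  • a.e. almost everywhere
  • a.s. almost surely
  • cdf cumulative distribution function
  • df degrees of freedom
  • i.i.d. independent and identically distributed
  • i.o. infinitely often
  • pdf probability density function
  • pmf probability mass function
  • r.v. random variable
  • w.p. with probability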
