# Notation in probability and statistics

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

## Probability theory

• Random variables are usually written in upper case roman letters: X, Y, etc.
• Particular realizations of a random variable are written in the corresponding lower case letters. For example, $x_1, x_2, \ldots, x_n$ could be a sample corresponding to the random variable $X$. A cumulative probability is formally written $P(X \le x)$ to distinguish the random variable from its realization.
• The probability is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a measure $P$, so as to avoid having to state that "$P$ is a probability". $\mathbb{P}(X \in A)$ is short for $P(\{\omega : X(\omega) \in A\})$, where $\omega$ is an outcome in the sample space and $X(\omega)$ is the value the random variable $X$ takes at that outcome.
• $\mathbb{P}(A \cap B)$ or $\mathbb{P}[A \cap B]$ indicates the probability that events A and B both occur.
• $\mathbb{P}(A \cup B)$ or $\mathbb{P}[A \cup B]$ indicates the probability of either event A or event B occurring ("or" in this case means one or the other or both).
• σ-algebras are usually written with upper case calligraphic letters (e.g. $\mathcal{F}$ for the set of sets on which we define the probability $P$).
• Probability density functions (pdfs) and probability mass functions are denoted by lower case letters, e.g. f(x).
• Cumulative distribution functions (cdfs) are denoted by upper case letters, e.g. F(x).
• Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative distribution function: $\overline{F}(x) = 1 - F(x)$.
• In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
• Some common operators:
• "$X$ is independent of $Y$" is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and "$X$ is independent of $Y$ given $W$" is often written $X \perp\!\!\!\perp Y \,|\, W$ or $X \perp Y \,|\, W$.
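
The conventions above can be made concrete on a small finite sample space. The following is a minimal sketch, assuming two fair six-sided dice and a uniform probability measure; the names `omega_space`, `prob_X_in`, `F` and `F_bar` are illustrative choices, not standard identifiers.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes (omega) of two fair six-sided dice (assumed example).
omega_space = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event, given as a set of outcomes, under the uniform measure."""
    return Fraction(len(event), len(omega_space))

def X(omega):
    """Random variable X: maps an outcome to the sum of the two dice."""
    return omega[0] + omega[1]

def prob_X_in(A):
    """P(X in A), shorthand for P({omega : X(omega) in A})."""
    return P({omega for omega in omega_space if X(omega) in A})

# Two events, written as sets of outcomes.
A = {w for w in omega_space if w[0] == 6}    # first die shows 6
B = {w for w in omega_space if X(w) >= 10}   # sum is at least 10

p_both = P(A & B)     # P(A ∩ B): both events occur
p_either = P(A | B)   # P(A ∪ B): one or the other or both

def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return prob_X_in({v for v in range(2, 13) if v <= x})

def F_bar(x):
    """Survival function: 1 - F(x) = P(X > x)."""
    return 1 - F(x)
```

Exact fractions are used so that identities such as $\mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A \cap B)$ can be checked without rounding error.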

## Statistics

• Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
• A tilde (~) denotes "has the probability distribution of".
• The hat symbol ^ denotes an estimator of the true parameter.
• An estimate of a parameter is often denoted by placing a caret or "hat" over the corresponding symbol, e.g. $\hat{\theta}$, pronounced "theta hat".
• The arithmetic mean of a series of values $x_1, x_2, \ldots, x_n$ is often denoted by placing an "overbar" over the symbol, e.g. $\bar{x}$, pronounced "x bar".
• Some commonly used symbols for sample statistics are given below:
• Some commonly used symbols for population parameters are given below:
• the population mean μ,
• the population variance $\sigma^2$,
• the population standard deviation σ,
• the population correlation ρ,
• the population cumulants $\kappa_r$.
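
To illustrate the distinction between population parameters and their estimates, here is a minimal sketch using Python's standard `statistics` module. The data values are hypothetical, and the variable names (`x_bar`, `mu_hat`, `sigma2_hat`, `sigma_hat`) simply mirror the notation $\bar{x}$, $\hat{\mu}$, $\hat{\sigma}^2$ and $\hat{\sigma}$.

```python
import statistics

# Hypothetical realizations x_1, ..., x_n of a random variable X.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

x_bar = statistics.mean(x)            # arithmetic mean, "x bar"
mu_hat = x_bar                        # used as an estimate of the population mean mu
sigma2_hat = statistics.variance(x)   # sample variance (divides by n - 1), estimate of sigma^2
sigma_hat = statistics.stdev(x)       # square root of the sample variance, estimate of sigma
```

The population parameters $\mu$, $\sigma^2$ and $\sigma$ stay unknown; only the hatted quantities are computed from data.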

## Critical values

The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value $x_\alpha$ such that $F(x_\alpha) = 1 - \alpha$, where $F$ is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics.
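
As a sketch of this definition, the upper critical value of the standard normal distribution can be obtained by inverting its cdf. The example assumes α = 0.05 and uses `statistics.NormalDist` from the Python standard library; the name `z_alpha` is an illustrative choice.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution: mean 0, standard deviation 1
alpha = 0.05       # assumed level for this example

# Upper critical value: the point z_alpha with F(z_alpha) = 1 - alpha.
z_alpha = Z.inv_cdf(1 - alpha)

# Check: the probability of exceeding z_alpha should be alpha.
tail = 1 - Z.cdf(z_alpha)
```

Here `z_alpha` is approximately 1.645, the familiar one-sided 5% point of the standard normal distribution.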

## Linear algebra

• Matrices are usually denoted by boldface capital letters, e.g. $\mathbf{A}$.
• Column vectors are usually denoted by boldface lower case letters, e.g. $\mathbf{x}$.
• The transpose operator is denoted by either a superscript T (e.g. $\mathbf{A}^{T}$) or a prime symbol (e.g. $\mathbf{A}'$).
• A row vector is written as the transpose of a column vector, e.g. $\mathbf{x}^{T}$ or $\mathbf{x}'$.
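
The relationship between column vectors, row vectors and the transpose can be sketched in plain Python, assuming a matrix is stored as a list of rows. The helper `transpose` is a hypothetical function written for this illustration, not a library call.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3],
     [4, 5, 6]]        # a 2 x 3 matrix

x = [[1], [0], [2]]    # a 3 x 1 column vector

A_T = transpose(A)     # the 3 x 2 transpose of A
x_T = transpose(x)     # the 1 x 3 row vector obtained by transposing x
```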

## Abbreviations

Common abbreviations include: