= Notation in probability and statistics =

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

==Probability theory==

- Random variables are usually written as upper-case Roman letters, such as $X$ or $Y$. In this context a random variable usually stands for a quantity described in words, such as "the height of a subject" (a continuous variable), "the number of cars in the school car park" (a discrete variable), or "the colour of the next bicycle" (a categorical variable). It does not represent a single number or a single category. For instance, $P(X = x)$ is the probability that a particular realisation of the random variable $X$ (e.g., height, number of cars, or bicycle colour) equals a particular value or category $x$ (e.g., 1.735 m, 52, or purple). $X$ and $x$ must not be confused: $X$ is an idea, $x$ is a value. They are related, but they do not have identical meanings.
- Particular realisations of a random variable are written in the corresponding lower-case letters. For example, $x_1,x_2, \ldots,x_n$ could be a sample corresponding to the random variable $X$. A cumulative probability is formally written $P(X\le x)$ to distinguish the random variable from its realisation.
- The probability measure is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a generic measure $P$, which avoids having to state "$P$ is a probability". $\mathbb{P}(X\in A)$ is short for $\mathbb{P}(\{\omega \in\Omega: X(\omega) \in A\})$, where $\Omega$ is the sample space, $X$ is a random variable (a function of $\omega$), and $\omega$ is an outcome in $\Omega$ (say, a particular height, or a particular colour of a car). The notation $\Pr(A)$ is also used.
- $\mathbb{P}(A \cap B)$ or $\mathbb{P}[B \cap A]$ indicates the probability that events $A$ and $B$ both occur. The joint probability distribution of random variables $X$ and $Y$ is denoted by $P(X, Y)$, the joint probability mass function or probability density function by $f(x, y)$, and the joint cumulative distribution function by $F(x, y)$.
- $\mathbb{P}(A \cup B)$ or $\mathbb{P}[B \cup A]$ indicates the probability that event $A$ or event $B$ occurs ("or" here means one or the other or both).
- σ-algebras are usually written with uppercase calligraphic letters (e.g. $\mathcal F$ for the set of sets on which the probability $P$ is defined).
- Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. $f(x)$, or $f_X(x)$.
- Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. $F(x)$, or $F_X(x)$.
- Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative distribution function, $\overline{F}(x) =1-F(x)$, or by $S(x)$.
- In particular, the pdf of the standard normal distribution is denoted by $\varphi(z)$, and its cdf by $\Phi(z)$.
- Some common operators:
  - $\mathrm{E}[X]$: expected value of $X$
  - $\operatorname{var}[X]$: variance of $X$
  - $\operatorname{cov}[X,Y]$: covariance of $X$ and $Y$
- "$X$ is independent of $Y$" is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and "$X$ is independent of $Y$ given $W$" is often written $X \perp\!\!\!\perp Y \,|\, W$ or $X \perp Y \,|\, W$.
- $P(A\mid B)$, the conditional probability, is the probability of $A$ given $B$.
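
The notation above can be tried out on a small finite example. The following is a minimal sketch, assuming a single roll of a fair die; the events `A` and `B` and the helper names `prob`, `F` and `S` are illustrative choices, not part of any standard.

```python
from fractions import Fraction

# Sample space Omega for one roll of a fair die; each outcome has probability 1/6.
omega = range(1, 7)
p = {w: Fraction(1, 6) for w in omega}

def prob(event):
    """P(A) for an event A given as a set of outcomes."""
    return sum(p[w] for w in event)

A = {w for w in omega if w % 2 == 0}   # assumed event: "the roll is even"
B = {w for w in omega if w > 3}        # assumed event: "the roll exceeds 3"

p_and = prob(A & B)                    # P(A ∩ B)
p_or = prob(A | B)                     # P(A ∪ B)
p_given = prob(A & B) / prob(B)        # P(A | B)

def F(x):
    """cdf F(x) = P(X <= x), where X is the face shown."""
    return prob({w for w in omega if w <= x})

def S(x):
    """Survival function S(x) = 1 - F(x)."""
    return 1 - F(x)

mean = sum(w * p[w] for w in omega)                # E[X]
var = sum((w - mean) ** 2 * p[w] for w in omega)   # var[X]
```

Exact fractions are used so that identities such as $\overline{F}(x) = 1 - F(x)$ hold without rounding error.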

==Statistics==

- Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
- Some commonly used symbols for population or distribution parameters are given below:
  - the population mean $\mu$,
  - the population variance $\sigma^2$,
  - the population standard deviation $\sigma$,
  - the population correlation $\rho$,
  - the population cumulants $\kappa_r$.
- A tilde (~) denotes "has the probability distribution of", e.g. $X \sim N(\mu, \sigma^2)$.
- Placing a hat, or caret (also known as a circumflex), over a true parameter denotes an estimator of it, e.g., $\widehat{\theta}$ is an estimator for $\theta$.
- The arithmetic mean of a series of values $x_1,x_2, \ldots,x_n$ is often denoted by placing an "overbar" over the symbol, e.g. $\bar{x}$, pronounced "$x$ bar".
- Some commonly used symbols for sample statistics are given below:
  - the sample mean $\bar{x}$,
  - the sample variance $s^2$,
  - the sample standard deviation $s$,
  - the sample correlation coefficient $r$,
  - the sample cumulants $k_r$.
- $x_{(k)}$ is used for the $k^\text{th}$ order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size $n$.
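
As a rough illustration of the sample statistics listed above, the sketch below computes $\bar{x}$, $s^2$, $s$, $r$ and the order statistics with Python's standard library. The data values are made up for the example, and the correlation is computed by hand from its usual definition rather than through a library call.

```python
import math
import statistics

x = [2.0, 4.0, 4.0, 5.0, 7.0]   # hypothetical sample x_1, ..., x_n
y = [1.0, 3.0, 2.0, 5.0, 6.0]   # hypothetical paired sample

x_bar = statistics.mean(x)       # sample mean
s2 = statistics.variance(x)      # sample variance s^2 (divisor n - 1)
s = statistics.stdev(x)          # sample standard deviation

# Sample correlation coefficient r from sums of squares and cross-products.
y_bar = statistics.mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

order = sorted(x)                # order[k - 1] is the k-th order statistic x_(k)
x_min, x_max = order[0], order[-1]
```

Here `order[0]` plays the role of $x_{(1)}$ and `order[-1]` that of $x_{(n)}$.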

==Critical values==

The α-level upper critical value of a probability distribution is the value exceeded with probability $\alpha$, that is, the value $x_\alpha$ such that $F(x_\alpha) = 1-\alpha$, where $F$ is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
- $z_\alpha$ or $z(\alpha)$ for the standard normal distribution
- $t_{\alpha,\nu}$ or $t(\alpha,\nu)$ for the t-distribution with $\nu$ degrees of freedom
- $\chi^2_{\alpha,\nu}$ or $\chi^2(\alpha,\nu)$ for the chi-squared distribution with $\nu$ degrees of freedom
- $F_{\alpha,\nu_1,\nu_2}$ or $F(\alpha,\nu_1,\nu_2)$ for the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom
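
The definition $F(x_\alpha) = 1-\alpha$ means an upper critical value is the inverse cdf evaluated at $1-\alpha$. A minimal sketch for $z_\alpha$, using `statistics.NormalDist` from the Python standard library (the function name `z_upper` is a hypothetical helper):

```python
from statistics import NormalDist

def z_upper(alpha):
    """Upper critical value z_alpha: the x satisfying F(x) = 1 - alpha."""
    return NormalDist().inv_cdf(1 - alpha)

z_05 = z_upper(0.05)     # roughly 1.645
z_025 = z_upper(0.025)   # roughly 1.960

# Check the defining property: z_alpha is exceeded with probability alpha.
tail = 1 - NormalDist().cdf(z_025)
```

The standard library covers only the normal case. For the t, chi-squared and F distributions a third-party package is needed; for instance, the distributions in `scipy.stats` provide a `ppf` (inverse cdf) method that can be used in the same way.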

==Linear algebra==

- Matrices are usually denoted by boldface capital letters, e.g. $\bold{A}$.
- Column vectors are usually denoted by boldface lowercase letters, e.g. $\bold{x}$.
- The transpose operator is denoted by either a superscript T (e.g. $\bold{A}^\mathrm{T}$) or a prime symbol (e.g. $\bold{A}'$).
- A row vector is written as the transpose of a column vector, e.g. $\bold{x}^\mathrm{T}$ or $\bold{x}'$.
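
To make the column/row convention concrete, here is a small sketch that stores matrices as plain Python lists of rows. The helpers `transpose` and `matmul` are written for this example only; a numerical library would normally be used instead.

```python
A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix stored as a list of rows

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a (n x k) and b (k x m)."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]

A_t = transpose(A)       # A', a 3 x 2 matrix

x = [[1], [2], [3]]      # column vector x as a 3 x 1 matrix
x_t = transpose(x)       # row vector x' as a 1 x 3 matrix

inner = matmul(x_t, x)   # x'x, a 1 x 1 matrix
Ax = matmul(A, x)        # Ax, a 2 x 1 column vector
```

Treating $\bold{x}$ as a one-column matrix makes the shapes explicit: $\bold{x}'\bold{x}$ is a scalar (here a 1 x 1 matrix), while $\bold{A}\bold{x}$ is again a column vector.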

==Abbreviations==

Common abbreviations include:
- a.e.: almost everywhere
- a.s.: almost surely
- cdf: cumulative distribution function
- cmf: cumulative mass function
- df: degrees of freedom (also $\nu$)
- i.i.d.: independent and identically distributed
- pdf: probability density function
- pmf: probability mass function
- r.v.: random variable
- w.p.: with probability; wp1: with probability 1
- i.o.: infinitely often, i.e. $\{ A_n\text{ i.o.} \} = \bigcap_N\bigcup_{n\geq N} A_n$
- ult.: ultimately, i.e. $\{ A_n \text{ ult.} \} = \bigcup_N\bigcap_{n\geq N} A_n$
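
The last two abbreviations can be illustrated on a toy sequence of events. The sketch below assumes a hypothetical sequence $A_n$ that is periodic with period 2 once $n \ge 5$. For such a sequence the tail unions and intersections stop changing, so "infinitely often" reduces to the union over one period of the tail and "ultimately" to the intersection over one period. This shortcut relies on the assumed periodicity and does not apply to general sequences.

```python
def A(n):
    """Hypothetical events A_n, periodic with period 2 once n >= 5."""
    s = {2}                          # outcome 2 lies in every A_n
    s.add(0 if n % 2 == 0 else 1)    # 0 in even-indexed sets, 1 in odd-indexed sets
    if n < 5:
        s.add(3)                     # outcome 3 appears only finitely often
    return s

tail = [A(n) for n in range(5, 7)]   # one full period of the periodic tail

infinitely_often = set.union(*tail)       # {A_n i.o.}
ultimately = set.intersection(*tail)      # {A_n ult.}
```

Outcomes 0 and 1 each occur in infinitely many $A_n$ but not in all sufficiently late ones, so they belong to $\{A_n \text{ i.o.}\}$ only. Outcome 2 belongs to both sets, and outcome 3 to neither.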

== See also ==
- Glossary of probability and statistics
- Combinations and permutations
- History of mathematical notation
