RV coefficient

In statistics, the RV coefficient[1] is a multivariate generalization of the squared Pearson correlation coefficient; like the squared correlation, it takes values between 0 and 1.[2] It measures the closeness of two sets of points, each of which may be represented as a matrix.

The major approaches within statistical multivariate data analysis can all be brought into a common framework in which the RV coefficient is maximised subject to relevant constraints. Specifically, these statistical methodologies include:[1]

One application of the RV coefficient is in functional neuroimaging, where it can measure the similarity between two subjects' series of brain scans[3] or between different scans of the same subject.[4]

Definitions

The definition of the RV coefficient makes use of ideas[5] concerning scalar-valued quantities that are called the "variance" and "covariance" of vector-valued random variables. Note that standard usage is instead to represent the variances and covariances of vector random variables as matrices. Given these scalar-valued definitions, the RV coefficient is then just the correlation coefficient defined in the usual way.

Suppose that X and Y are matrices of centered random vectors (column vectors) with covariance matrix given by

${\displaystyle \Sigma _{XY}=\operatorname {E} (X^{T}Y)\,,}$

then the scalar-valued covariance (denoted by COVV) is defined by[5]

${\displaystyle \operatorname {COVV} (X,Y)=\operatorname {Tr} (\Sigma _{XY}\Sigma _{YX})\,.}$

The scalar-valued variance is defined correspondingly:

${\displaystyle \operatorname {VAV} (X)=\operatorname {Tr} (\Sigma _{XX}^{2})\,.}$

With these definitions, the variance and covariance have certain additive properties in relation to the formation of new vector quantities by extending an existing vector with the elements of another.[5]
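
As a sketch of what this additivity looks like (worked out here from the definitions above rather than quoted from [5]), let $X=(X_{1}\;X_{2})$ denote, as an illustrative notation, the matrix obtained by appending the columns of $X_{2}$ to those of $X_{1}$. Since $\Sigma _{YX}=\Sigma _{XY}^{T}$, the scalar-valued covariance is the sum of the squared entries of $\Sigma _{XY}$:

${\displaystyle \operatorname {COVV} (X,Y)=\operatorname {Tr} (\Sigma _{XY}\Sigma _{XY}^{T})=\sum _{i,j}(\Sigma _{XY})_{ij}^{2}\,.}$

The rows of $\Sigma _{XY}$ are those of $\Sigma _{X_{1}Y}$ followed by those of $\Sigma _{X_{2}Y}$, so the sum splits into two parts:

${\displaystyle \operatorname {COVV} (X,Y)=\operatorname {COVV} (X_{1},Y)+\operatorname {COVV} (X_{2},Y)\,.}$

Applying the same argument to the four blocks of $\Sigma _{XX}$ gives an expansion that mirrors the scalar identity for the variance of a sum:

${\displaystyle \operatorname {VAV} (X)=\operatorname {VAV} (X_{1})+2\operatorname {COVV} (X_{1},X_{2})+\operatorname {VAV} (X_{2})\,.}$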

Then the RV-coefficient is defined by[5]

${\displaystyle \mathrm {RV} (X,Y)={\frac {\operatorname {COVV} (X,Y)}{\sqrt {\operatorname {VAV} (X)\operatorname {VAV} (Y)}}}\,.}$
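
The definition above is stated for population quantities. A minimal sample-based sketch in Python follows, assuming that the rows of each matrix are paired observations, that the columns are variables, and that the expectation is replaced by an average over rows. The function name `rv_coefficient` and the use of NumPy are choices made for this illustration only and do not come from the cited sources.

```python
import numpy as np


def rv_coefficient(X, Y):
    """Sample RV coefficient of two data matrices with the same number of rows.

    Illustrative sketch: rows are observations, columns are variables, and
    the expectation in the definition is replaced by the mean over rows.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 2 or Y.ndim != 2:
        raise ValueError("X and Y must be 2-D arrays")
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of rows")

    n = X.shape[0]
    # Center each column, since the definition assumes centered variables.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample analogues of Sigma_XY, Sigma_XX and Sigma_YY.
    # The 1/n factor cancels in the final ratio.
    s_xy = Xc.T @ Yc / n
    s_xx = Xc.T @ Xc / n
    s_yy = Yc.T @ Yc / n

    covv = np.trace(s_xy @ s_xy.T)  # COVV(X, Y) = Tr(Sigma_XY Sigma_YX)
    vav_x = np.trace(s_xx @ s_xx)   # VAV(X) = Tr(Sigma_XX^2)
    vav_y = np.trace(s_yy @ s_yy)   # VAV(Y) = Tr(Sigma_YY^2)

    # Not handled here: if every column of X or Y is constant, the
    # denominator is zero and the ratio is undefined.
    return covv / np.sqrt(vav_x * vav_y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    Y = rng.normal(size=(50, 4))
    print(rv_coefficient(X, Y))      # two unrelated random data sets
    print(rv_coefficient(X, 2 * X))  # Y is a rescaled copy of X
```

Under these assumptions the result lies between 0 and 1, is symmetric in its two arguments, and equals 1 (up to rounding) when one matrix is a rescaled or rotated copy of the other. The two matrices may have different numbers of columns, since only the traces of square products enter the ratio.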