# Cronbach's alpha

Tau-equivalent reliability (${\displaystyle {\rho }_{T}}$), also known as Cronbach's alpha or coefficient alpha, is the most common test score reliability coefficient for single administration (i.e., the reliability of persons over items holding occasion fixed).[1][2][3]

Recent studies recommend not using it unconditionally.[4][5][6][7][8][9] Reliability coefficients based on structural equation modeling (SEM) are often recommended as an alternative.

## Formula and calculation

### Systematic and conventional formula

Let ${\displaystyle X_{i}}$ denote the observed score of item ${\displaystyle i}$ and ${\displaystyle X=(X_{1}+X_{2}+\cdots +X_{k})}$ denote the sum of all items in a test consisting of ${\displaystyle k}$ items. Let ${\displaystyle \sigma _{ij}}$ denote the covariance between ${\displaystyle X_{i}}$ and ${\displaystyle X_{j}}$, ${\displaystyle \sigma _{i}^{2}(=\sigma _{ii})}$ denote the variance of ${\displaystyle X_{i}}$, and ${\displaystyle \sigma _{X}^{2}}$ denote the variance of ${\displaystyle X}$. ${\displaystyle \sigma _{X}^{2}}$ consists of item variances and inter-item covariances:

${\displaystyle \sigma _{X}^{2}=\sum _{i=1}^{k}\sum _{j=1}^{k}\sigma _{ij}=\sum _{i=1}^{k}\sigma _{i}^{2}+\sum _{i=1}^{k}\sum _{j\neq {i}}^{k}\sigma _{ij}}$.

Let ${\displaystyle {\overline {\sigma _{ij}}}}$ denote the average of the inter-item covariances:

${\displaystyle {\overline {\sigma _{ij}}}={\sum _{i=1}^{k}\sum _{j\neq {i}}^{k}\sigma _{ij} \over k(k-1)}}$.

${\displaystyle {\rho }_{T}}$'s "systematic"[3] formula is

${\displaystyle \rho _{T}={k^{2}{\overline {\sigma _{ij}}} \over \sigma _{X}^{2}}}$.

The more frequently used version of the formula is

${\displaystyle {\rho }_{T}={{k} \over {k-1}}\left(1-{{\sum _{i=1}^{k}\sigma _{i}^{2}} \over {\sigma _{X}^{2}}}\right)}$.
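The two versions are algebraically equivalent. A short sketch of why, using only the decomposition of ${\displaystyle \sigma _{X}^{2}}$ and the definition of ${\displaystyle {\overline {\sigma _{ij}}}}$ given above:

${\displaystyle k(k-1){\overline {\sigma _{ij}}}=\sum _{i=1}^{k}\sum _{j\neq {i}}^{k}\sigma _{ij}=\sigma _{X}^{2}-\sum _{i=1}^{k}\sigma _{i}^{2}}$,

${\displaystyle \rho _{T}={k^{2}{\overline {\sigma _{ij}}} \over \sigma _{X}^{2}}={k^{2} \over k(k-1)}\cdot {\sigma _{X}^{2}-\sum _{i=1}^{k}\sigma _{i}^{2} \over \sigma _{X}^{2}}={k \over k-1}\left(1-{\sum _{i=1}^{k}\sigma _{i}^{2} \over \sigma _{X}^{2}}\right)}$.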

### Calculation example

#### When applied to appropriate data

${\displaystyle {\rho }_{T}}$ is applied to the following data that satisfies the condition of being tau-equivalent.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 6}$ | ${\displaystyle 11}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 12}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 13}$ |

${\displaystyle k=4}$, ${\displaystyle {\overline {\sigma _{ij}}}=6}$,

${\displaystyle \sigma _{X}^{2}=\sum _{i=1}^{k}\sigma _{i}^{2}+\sum _{i=1}^{k}\sum _{j\neq {i}}^{k}\sigma _{ij}=(10+11+12+13)+4*(4-1)*6=118}$,

and ${\displaystyle \rho _{T}={4^{2}*6 \over 118}=.8136}$.

#### When applied to inappropriate data

${\displaystyle {\rho }_{T}}$ is applied to the following data that does not satisfy the condition of being tau-equivalent.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 4}$ | ${\displaystyle 5}$ | ${\displaystyle 7}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 4}$ | ${\displaystyle 11}$ | ${\displaystyle 6}$ | ${\displaystyle 8}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 5}$ | ${\displaystyle 6}$ | ${\displaystyle 12}$ | ${\displaystyle 9}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 7}$ | ${\displaystyle 8}$ | ${\displaystyle 9}$ | ${\displaystyle 13}$ |

${\displaystyle k=4}$, ${\displaystyle {\overline {\sigma _{ij}}}=(4+5+6+7+8+9)/6=6.5}$,

${\displaystyle \sigma _{X}^{2}=(10+11+12+13)+2*(4+5+6+7+8+9)=124}$,

and ${\displaystyle \rho _{T}={4^{2}*6.5 \over 124}=.8387}$.

Compare this value with the value of applying congeneric reliability to the same data.
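The two worked examples above can be reproduced with a few lines of code. The following is a minimal sketch, not a reference implementation: the function names are illustrative, the input is assumed to be a complete population-level covariance matrix, and no missing-data or sampling issues are handled.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def tau_equivalent_reliability(cov: Matrix) -> float:
    """Conventional formula: k/(k-1) * (1 - sum of item variances / var(X))."""
    k = len(cov)
    if k < 2 or any(len(row) != k for row in cov):
        raise ValueError("cov must be a square matrix with at least two items")
    item_variances = sum(cov[i][i] for i in range(k))
    # The variance of the sum score X is the sum of all entries of the matrix.
    total_variance = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - item_variances / total_variance)


def systematic_rho_t(cov: Matrix) -> float:
    """'Systematic' formula: k^2 * mean inter-item covariance / var(X)."""
    k = len(cov)
    off_diagonal = [cov[i][j] for i in range(k) for j in range(k) if i != j]
    mean_covariance = sum(off_diagonal) / (k * (k - 1))
    total_variance = sum(sum(row) for row in cov)
    return k ** 2 * mean_covariance / total_variance


# Covariance matrices copied from the two calculation examples.
tau_equivalent = [
    [10, 6, 6, 6],
    [6, 11, 6, 6],
    [6, 6, 12, 6],
    [6, 6, 6, 13],
]
not_tau_equivalent = [
    [10, 4, 5, 7],
    [4, 11, 6, 8],
    [5, 6, 12, 9],
    [7, 8, 9, 13],
]

print(round(tau_equivalent_reliability(tau_equivalent), 4))      # 0.8136
print(round(tau_equivalent_reliability(not_tau_equivalent), 4))  # 0.8387
```

Both functions return the same value for any valid covariance matrix; they differ only in which version of the formula they follow.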

## Prerequisites for using tau-equivalent reliability

In order to use ${\displaystyle {\rho }_{T}}$ as a reliability coefficient, the data must satisfy the following conditions.

1) Unidimensionality;

2) (Essential) tau-equivalence;

3) Independence between errors.

### The conditions of being parallel, tau-equivalent, and congeneric

#### Parallel condition

At the population level, parallel data have equal inter-item covariances (i.e., non-diagonal elements of the covariance matrix) and equal variances (i.e., diagonal elements of the covariance matrix). For example, the following data satisfy the parallel condition. In parallel data, even if a correlation matrix is used instead of a covariance matrix, there is no loss of information. All parallel data are also tau-equivalent, but the reverse is not true. That is, among the three conditions, the parallel condition is the most difficult to meet.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ |

#### Tau-equivalent condition

A tau-equivalent measurement model is a special case of a congeneric measurement model in which all factor loadings are assumed to be equal, i.e. ${\displaystyle \lambda =\lambda _{1}=\lambda _{2}=\lambda _{3}=\cdots =\lambda _{k}}$.

At the population level, tau-equivalent data have equal covariances, but their variances may have different values. For example, the following data satisfy the condition of being tau-equivalent. All items in tau-equivalent data have equal discrimination or importance. All tau-equivalent data are also congeneric, but the reverse is not true.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 6}$ | ${\displaystyle 12}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 9}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ |

#### Congeneric condition

[Figure: Congeneric measurement model]

At the population level, congeneric data need not have equal variances or covariances, provided they are unidimensional. For example, the following data meet the condition of being congeneric. All items in congeneric data can have different discrimination or importance.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 5}$ | ${\displaystyle 4}$ | ${\displaystyle 3}$ | ${\displaystyle 2}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 4}$ | ${\displaystyle 20}$ | ${\displaystyle 12}$ | ${\displaystyle 8}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 3}$ | ${\displaystyle 12}$ | ${\displaystyle 13}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 2}$ | ${\displaystyle 8}$ | ${\displaystyle 6}$ | ${\displaystyle 8}$ |
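The covariance patterns that separate the parallel and tau-equivalent conditions can be checked mechanically. The sketch below is illustrative only: `classify_population_covariance` is a hypothetical helper, it assumes an exact population-level matrix (sample matrices would need a statistical test rather than an equality check), and it cannot confirm that data are congeneric, because unidimensionality requires fitting a factor model.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def classify_population_covariance(cov: Matrix, tol: float = 1e-9) -> str:
    """Classify an exact covariance matrix by its variance/covariance pattern."""
    k = len(cov)
    variances = [cov[i][i] for i in range(k)]
    covariances = [cov[i][j] for i in range(k) for j in range(i + 1, k)]
    equal_covariances = max(covariances) - min(covariances) <= tol
    equal_variances = max(variances) - min(variances) <= tol
    if equal_covariances and equal_variances:
        return "parallel"
    if equal_covariances:
        return "tau-equivalent"
    # Unequal covariances: at best congeneric, which this check cannot establish.
    return "not tau-equivalent"


# The three example matrices from this section.
parallel_example = [
    [10, 6, 6, 6],
    [6, 10, 6, 6],
    [6, 6, 10, 6],
    [6, 6, 6, 10],
]
tau_equivalent_example = [
    [10, 6, 6, 6],
    [6, 12, 6, 6],
    [6, 6, 9, 6],
    [6, 6, 6, 10],
]
congeneric_example = [
    [5, 4, 3, 2],
    [4, 20, 12, 8],
    [3, 12, 13, 6],
    [2, 8, 6, 8],
]

for name, matrix in [
    ("parallel example", parallel_example),
    ("tau-equivalent example", tau_equivalent_example),
    ("congeneric example", congeneric_example),
]:
    print(name, "->", classify_population_covariance(matrix))
```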

## Relationship with other reliability coefficients

### Classification of single-administration reliability coefficients

#### Conventional names

There are numerous reliability coefficients. Among them, the conventional names of reliability coefficients that are related and frequently used are summarized as follows:[3]

Conventional names of reliability coefficients

| | Split-half | Unidimensional | Multidimensional |
|---|---|---|---|
| Parallel | Spearman-Brown formula | Standardized ${\displaystyle \alpha }$ | (No conventional name) |
| Tau-equivalent | Flanagan formula; Rulon formula; Flanagan-Rulon formula; Guttman's ${\displaystyle \lambda _{4}}$ | Cronbach's ${\displaystyle \alpha }$; coefficient ${\displaystyle \alpha }$; Guttman's ${\displaystyle \lambda _{3}}$; KR-20; Hoyt reliability | Stratified ${\displaystyle \alpha }$ |
| Congeneric | Angoff-Feldt coefficient; Raju (1970) coefficient | composite reliability; construct reliability; congeneric reliability; coefficient ${\displaystyle \omega }$; unidimensional ${\displaystyle \omega }$; Raju (1977) coefficient | coefficient ${\displaystyle \omega }$; ${\displaystyle \omega }$ total; McDonald's ${\displaystyle \omega }$; multidimensional ${\displaystyle \omega }$ |

Combining row and column names gives the prerequisites for the corresponding reliability coefficient. For example, Cronbach's ${\displaystyle \alpha }$ and Guttman's ${\displaystyle \lambda _{3}}$ are reliability coefficients derived under the condition of being unidimensional and tau-equivalent.

#### Systematic names

Conventional names are disordered and unsystematic. Conventional names give no information about the nature of each coefficient, or give misleading information (e.g., composite reliability). Conventional names are inconsistent. Some are formulas, and others are coefficients. Some are named after the original developer, some are named after someone who is not the original developer, and others do not include the name of any person. While one formula is referred to by multiple names, multiple formulas are referred to by one notation (e.g., alphas and omegas). The proposed systematic names and their notation for these reliability coefficients are as follows: [3]

Systematic names of reliability coefficients

| | Split-half | Unidimensional | Multidimensional |
|---|---|---|---|
| Parallel | split-half parallel reliability (${\displaystyle \rho _{SP}}$) | parallel reliability (${\displaystyle \rho _{P}}$) | multidimensional parallel reliability (${\displaystyle \rho _{MP}}$) |
| Tau-equivalent | split-half tau-equivalent reliability (${\displaystyle \rho _{ST}}$) | tau-equivalent reliability (${\displaystyle \rho _{T}}$) | multidimensional tau-equivalent reliability (${\displaystyle \rho _{MT}}$) |
| Congeneric | split-half congeneric reliability (${\displaystyle \rho _{SC}}$) | congeneric reliability (${\displaystyle \rho _{C}}$) | Bifactor model: bifactor reliability (${\displaystyle \rho _{BF}}$); second-order factor model: second-order factor reliability (${\displaystyle \rho _{SOF}}$); correlated factor model: correlated factor reliability (${\displaystyle \rho _{CF}}$) |

### Relationship with parallel reliability

${\displaystyle \rho _{T}}$ is often referred to as coefficient alpha and ${\displaystyle \rho _{P}}$ is often referred to as standardized alpha. Because of the "standardized" modifier, ${\displaystyle \rho _{P}}$ is often mistaken for a more standard version of ${\displaystyle \rho _{T}}$. There is no historical basis for referring to ${\displaystyle \rho _{P}}$ as standardized alpha. Cronbach (1951)[10] did not refer to this coefficient as alpha, nor did he recommend using it. ${\displaystyle \rho _{P}}$ was rarely used before the 1970s. As SPSS began to provide ${\displaystyle \rho _{P}}$ under the name of standardized alpha, this coefficient began to be used occasionally.[11] The use of ${\displaystyle \rho _{P}}$ is not recommended because the parallel condition is difficult to meet in real-world data.

### Relationship with split-half tau-equivalent reliability

${\displaystyle \rho _{T}}$ equals the average of the ${\displaystyle \rho _{ST}}$ values obtained for all possible split-halves. This relationship, proved by Cronbach (1951),[10] is often used to explain the intuitive meaning of ${\displaystyle \rho _{T}}$. However, this interpretation overlooks the fact that ${\displaystyle \rho _{ST}}$ underestimates reliability when applied to data that are not tau-equivalent. At the population level, the maximum of all possible ${\displaystyle \rho _{ST}}$ values is closer to reliability than the average of all possible ${\displaystyle \rho _{ST}}$ values.[7] This mathematical fact was already known even before the publication of Cronbach (1951).[12] A comparative study[13] reports that the maximum of ${\displaystyle \rho _{ST}}$ is the most accurate reliability coefficient.

Revelle (1979)[14] refers to the minimum of all possible ${\displaystyle \rho _{ST}}$ values as coefficient ${\displaystyle \beta }$, and argues that ${\displaystyle \beta }$ provides complementary information that ${\displaystyle \rho _{T}}$ does not.[6]
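The relationship between ${\displaystyle \rho _{T}}$ and the split-half values can be illustrated numerically. The sketch below assumes that ${\displaystyle \rho _{ST}}$ takes the Flanagan-Rulon form, four times the covariance between the two half scores divided by ${\displaystyle \sigma _{X}^{2}}$ (a formula not spelled out in this article), and it only considers splits into two halves of equal size. The function names are illustrative, and the data are the congeneric covariance matrix shown earlier.

```python
from itertools import combinations
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def rho_t(cov: Matrix) -> float:
    """Conventional formula for tau-equivalent reliability."""
    k = len(cov)
    item_variances = sum(cov[i][i] for i in range(k))
    total_variance = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - item_variances / total_variance)


def split_half_reliabilities(cov: Matrix) -> List[float]:
    """rho_ST = 4 * cov(A, B) / var(X) for every split into two equal halves."""
    k = len(cov)
    if k % 2 != 0:
        raise ValueError("this sketch only handles an even number of items")
    total_variance = sum(sum(row) for row in cov)
    values = []
    for half_a in combinations(range(k), k // 2):
        if 0 not in half_a:
            continue  # keep item 0 in half A so that each split is counted once
        half_b = [j for j in range(k) if j not in half_a]
        cov_ab = sum(cov[i][j] for i in half_a for j in half_b)
        values.append(4 * cov_ab / total_variance)
    return values


congeneric_example = [
    [5, 4, 3, 2],
    [4, 20, 12, 8],
    [3, 12, 13, 6],
    [2, 8, 6, 8],
]

halves = split_half_reliabilities(congeneric_example)
print("split-half values:", [round(v, 4) for v in halves])
print("average:", round(sum(halves) / len(halves), 4))
print("rho_T:", round(rho_t(congeneric_example), 4))
print("maximum:", round(max(halves), 4), "minimum (beta):", round(min(halves), 4))
```

For these data the average of the three split-half values coincides with ${\displaystyle \rho _{T}}$, while the maximum and the minimum lie above and below it.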

### Relationship with congeneric reliability

If the assumptions of unidimensionality and tau-equivalence are satisfied, ${\displaystyle \rho _{T}}$ equals ${\displaystyle \rho _{C}}$.

If unidimensionality is satisfied but tau-equivalence is not satisfied, ${\displaystyle \rho _{T}}$ is smaller than ${\displaystyle \rho _{C}}$.[7]
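A small numerical sketch of these two cases follows. It assumes the usual one-factor expression for ${\displaystyle \rho _{C}}$, namely the squared sum of factor loadings divided by the squared sum of loadings plus the sum of error variances, with the factor variance fixed at one and uncorrelated errors; that expression is not given in this article. The loadings and error variances are chosen by inspection so that they reproduce two covariance matrices shown earlier; in practice they would be estimated with SEM software.

```python
from math import sqrt
from typing import Sequence


def congeneric_reliability(loadings: Sequence[float],
                           error_variances: Sequence[float]) -> float:
    """rho_C = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    true_variance = sum(loadings) ** 2
    return true_variance / (true_variance + sum(error_variances))


def rho_t_from_model(loadings: Sequence[float],
                     error_variances: Sequence[float]) -> float:
    """rho_T computed from the covariance matrix implied by the one-factor model."""
    k = len(loadings)
    # Implied item variance: loading^2 + error variance.
    item_variances = sum(l * l + e for l, e in zip(loadings, error_variances))
    # Implied variance of the sum score: (sum of loadings)^2 + sum of error variances.
    total_variance = sum(loadings) ** 2 + sum(error_variances)
    return k / (k - 1) * (1 - item_variances / total_variance)


# Equal loadings: implies covariances of 6 and variances 10, 11, 12, 13.
tau_loadings = [sqrt(6)] * 4
tau_errors = [4, 5, 6, 7]

# Unequal loadings: implies the congeneric example matrix (variances 5, 20, 13, 8).
con_loadings = [1, 4, 3, 2]
con_errors = [4, 4, 4, 4]

print(round(congeneric_reliability(tau_loadings, tau_errors), 4),
      round(rho_t_from_model(tau_loadings, tau_errors), 4))
print(round(congeneric_reliability(con_loadings, con_errors), 4),
      round(rho_t_from_model(con_loadings, con_errors), 4))
```

With equal loadings the two coefficients agree; with unequal loadings ${\displaystyle \rho _{T}}$ falls below ${\displaystyle \rho _{C}}$.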

${\displaystyle \rho _{C}}$ is the most commonly used reliability coefficient after ${\displaystyle \rho _{T}}$. Users tend to present both, rather than replacing ${\displaystyle \rho _{T}}$ with ${\displaystyle \rho _{C}}$.[3]

A review of studies that reported both coefficients found that ${\displaystyle \rho _{T}}$ is .02 smaller than ${\displaystyle \rho _{C}}$ on average.[15]

### Relationship with multidimensional reliability coefficients and ${\displaystyle \omega _{T}}$

If ${\displaystyle \rho _{T}}$ is applied to multidimensional data, its value is smaller than multidimensional reliability coefficients and larger than ${\displaystyle \omega _{T}}$.[3]

### Relationship with intraclass correlation

${\displaystyle \rho _{T}}$ is said to be equal to the stepped-up consistency version of the intraclass correlation coefficient, which is commonly used in observational studies. But this is only conditionally true. In terms of variance components, the equality holds, for item sampling, if and only if the item (rater, in the case of rating) variance component equals zero. If this variance component is negative, ${\displaystyle \rho _{T}}$ will underestimate the stepped-up intraclass correlation coefficient; if this variance component is positive, ${\displaystyle \rho _{T}}$ will overestimate it.

## History[11]

### Before 1937

${\displaystyle \rho _{SP}}$[16][17] was the only known reliability coefficient. The problem was that the reliability estimates depended on how the items were split in half (e.g., odd/even or front/back). Criticism was raised against this unreliability, but for more than 20 years no fundamental solution was found.[18]

### Kuder and Richardson (1937)

Kuder and Richardson (1937)[19] developed several reliability coefficients that could overcome the problem of ${\displaystyle \rho _{SP}}$. They did not give the reliability coefficients particular names. Equation 20 in their article is ${\displaystyle \rho _{T}}$. This formula is often referred to as Kuder-Richardson Formula 20, or KR-20. They dealt with cases where the observed scores were dichotomous (e.g., correct or incorrect), so the expression of KR-20 is slightly different from the conventional formula of ${\displaystyle \rho _{T}}$. A review of this paper reveals that they did not present a general formula because they did not need to, not because they were not able to. Let ${\displaystyle p_{i}}$ denote the correct answer ratio of item ${\displaystyle i}$, and ${\displaystyle q_{i}}$ denote the incorrect answer ratio of item ${\displaystyle i}$ (${\displaystyle p_{i}+q_{i}=1}$). The formula of KR-20 is as follows.

${\displaystyle {\rho }_{KR-20}={{k} \over {k-1}}\left(1-{{\sum _{i=1}^{k}p_{i}q_{i}} \over {\sigma _{X}^{2}}}\right)}$

Since ${\displaystyle p_{i}q_{i}=\sigma _{i}^{2}}$, KR-20 and ${\displaystyle \rho _{T}}$ have the same meaning.
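The equivalence can be checked on dichotomous data. The sketch below uses a small, made-up response matrix (rows are examinees, columns are items) and illustrative function names. Note one assumption: the identity ${\displaystyle p_{i}q_{i}=\sigma _{i}^{2}}$ holds exactly only when variances are computed with a divisor of ${\displaystyle n}$ (population variance), so that convention is used throughout.

```python
from statistics import pvariance
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """KR-20 for 0/1 responses, using p_i * q_i as the item variance."""
    n = len(responses)
    k = len(responses[0])
    # Population variance (divisor n) of the total scores.
    total_variance = pvariance([sum(row) for row in responses])
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n  # proportion answering item i correctly
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / total_variance)


def rho_t_from_scores(responses: Sequence[Sequence[int]]) -> float:
    """Conventional rho_T formula, using item variances computed directly."""
    k = len(responses[0])
    item_variances = sum(pvariance([row[i] for row in responses]) for i in range(k))
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical data: six examinees, four dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

print(round(kr20(responses), 4))               # 0.7385
print(round(rho_t_from_scores(responses), 4))  # 0.7385
```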

### Between 1937 and 1951

#### Several studies published the general formula of KR-20

Kuder and Richardson (1937) made unnecessary assumptions to derive ${\displaystyle \rho _{T}}$. Several studies have derived ${\displaystyle \rho _{T}}$ in a different way from Kuder and Richardson (1937).

Hoyt (1941)[20] derived ${\displaystyle \rho _{T}}$ using ANOVA (Analysis of variance). Cyril Hoyt may be considered the first developer of the general formula of the KR-20, but he did not explicitly present the formula of ${\displaystyle \rho _{T}}$.

The first expression of the modern formula of ${\displaystyle \rho _{T}}$ appears in Jackson and Ferguson (1941).[21] The version they presented is as follows. Edgerton and Thompson (1942)[22] used the same version.

${\displaystyle {\rho }_{T}={{k} \over {k-1}}\left({\sigma _{X}^{2}-{\sum _{i=1}^{k}\sigma _{i}^{2}} \over {\sigma _{X}^{2}}}\right)}$

Guttman (1945)[12] derived six reliability formulas, each denoted by ${\displaystyle \lambda _{1},\cdots ,\lambda _{6}}$. Louis Guttman proved that all of these formulas were always less than or equal to reliability, and based on these characteristics, he referred to these formulas as 'lower bounds of reliability'. Guttman's ${\displaystyle \lambda _{4}}$ is ${\displaystyle \rho _{ST}}$, and ${\displaystyle \lambda _{3}}$ is ${\displaystyle \rho _{T}}$. He proved that ${\displaystyle \lambda _{2}}$ is always greater than or equal to ${\displaystyle \lambda _{3}}$ (i.e., more accurate). At that time, all calculations were done with paper and pencil, and since the formula of ${\displaystyle \lambda _{3}}$ was simpler to calculate, he mentioned that ${\displaystyle \lambda _{3}}$ was useful under certain conditions.

${\displaystyle \lambda _{3}={\rho }_{T}={{k} \over {k-1}}\left(1-{{\sum _{i=1}^{k}\sigma _{i}^{2}} \over {\sigma _{X}^{2}}}\right)}$

Gulliksen (1950)[23] derived ${\displaystyle \rho _{T}}$ with fewer assumptions than previous studies. The assumption he used is essential tau-equivalence in modern terms.

#### Recognition of KR-20's original formula and general formula at the time

The two formulas were recognized as exactly identical, and the expression "general formula of KR-20" was not used. Hoyt[20] explained that his method "gives precisely the same result" as KR-20 (p. 156). Jackson and Ferguson[21] stated that the two formulas are "identical" (p. 74). Guttman[12] said ${\displaystyle \lambda _{3}}$ is "algebraically identical" to KR-20 (p. 275). Gulliksen[23] also acknowledged that the two formulas are "identical" (p. 224).

Even studies critical of KR-20 did not point out that the original formula of KR-20 could only be applied to dichotomous data.[24]

#### Criticism of underestimation of KR-20

The developers[19] of this formula reported that ${\displaystyle \rho _{T}}$ consistently underestimates reliability. Hoyt[25] argued that this characteristic alone made ${\displaystyle \rho _{T}}$ more recommendable than the traditional split-half technique, for which it was unknown whether it underestimated or overestimated reliability.

Cronbach (1943)[24] was critical of the underestimation of ${\displaystyle \rho _{T}}$. He was concerned that it was not known how much ${\displaystyle \rho _{T}}$ underestimated reliability. He argued that the underestimation was likely to be excessively severe, such that ${\displaystyle \rho _{T}}$ could sometimes take negative values. Because of these problems, he argued that ${\displaystyle \rho _{T}}$ could not be recommended as an alternative to the split-half technique.

### Cronbach (1951)

As with previous studies,[20][12][21][23] Cronbach (1951)[10] invented another method to derive ${\displaystyle \rho _{T}}$. His interpretation was more intuitively attractive than those of previous studies: he proved that ${\displaystyle \rho _{T}}$ equals the average of the ${\displaystyle \rho _{ST}}$ values obtained for all possible split-halves. He criticized the name KR-20 as weird and suggested a new name, coefficient alpha. His approach was a huge success. However, he not only omitted some key facts but also gave an incorrect explanation.

First, he positioned coefficient alpha as a general formula of KR-20, but omitted the explanation that existing studies had published the precisely identical formula. Those who read only Cronbach (1951) without background knowledge could misunderstand that he was the first to develop the general formula of KR-20.

Second, he did not explain under what condition ${\displaystyle \rho _{T}}$ equals reliability. Non-experts could misunderstand that ${\displaystyle \rho _{T}}$ was a general reliability coefficient that could be used for all data regardless of prerequisites.

Third, he did not explain why he changed his attitude toward ${\displaystyle \rho _{T}}$. In particular, he did not provide a clear answer to the underestimation problem of ${\displaystyle \rho _{T}}$, which he himself[24] had criticized.

Fourth, he argued that a high value of ${\displaystyle \rho _{T}}$ indicated homogeneity of the data.

### After 1951

Novick and Lewis (1967)[26] proved the necessary and sufficient condition for ${\displaystyle \rho _{T}}$ to be equal to reliability, and named it the condition of being essentially tau-equivalent.

Cronbach (1978)[2] mentioned that the reason Cronbach (1951) received a lot of citations was "mostly because [he] put a brand name on a common-place coefficient" (p. 263).[3] He explained that he had originally planned to name other types of reliability coefficients (e.g., inter-rater reliability or test-retest reliability) with consecutive Greek letters (e.g., ${\displaystyle \beta }$, ${\displaystyle \gamma }$, ${\displaystyle \ldots }$), but later changed his mind.

Cronbach and Shavelson (2004)[27] encouraged readers to use generalizability theory rather than ${\displaystyle \rho _{T}}$. Cronbach opposed the use of the name Cronbach's alpha. He explicitly denied the existence of earlier studies that had published the general formula of KR-20 prior to Cronbach (1951).

## Common misconceptions about tau-equivalent reliability[7]

### The value of tau-equivalent reliability ranges between zero and one

By definition, reliability cannot be less than zero and cannot be greater than one. Many textbooks mistakenly equate ${\displaystyle \rho _{T}}$ with reliability and give an inaccurate explanation of its range. ${\displaystyle \rho _{T}}$ can be less than reliability when applied to data that are not tau-equivalent. Suppose that ${\displaystyle X_{2}}$ is an exact copy of ${\displaystyle X_{1}}$, and ${\displaystyle X_{3}}$ is ${\displaystyle X_{1}}$ multiplied by -1. The covariance matrix between items is as follows, and ${\displaystyle \rho _{T}=-3}$.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ |
|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle -1}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle -1}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle -1}$ | ${\displaystyle -1}$ | ${\displaystyle 1}$ |
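As a check on the value reported for this matrix, the arithmetic can be worked through with the systematic formula for ${\displaystyle k=3}$:

${\displaystyle \sigma _{X}^{2}=(1+1+1)+2(1-1-1)=1}$,

${\displaystyle {\overline {\sigma _{ij}}}={2(1-1-1) \over 3(3-1)}=-{1 \over 3}}$,

${\displaystyle \rho _{T}={3^{2}\cdot (-1/3) \over 1}=-3}$.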

Negative ${\displaystyle \rho _{T}}$ can occur for reasons such as negative discrimination or mistakes in processing reversely scored items.

Unlike ${\displaystyle \rho _{T}}$, SEM-based reliability coefficients (e.g., ${\displaystyle \rho _{C}}$) are always greater than or equal to zero.

This anomaly was first pointed out by Cronbach (1943)[24] to criticize ${\displaystyle \rho _{T}}$, but Cronbach (1951)[10] did not comment on this problem in his article, which discussed all conceivable issues related to ${\displaystyle \rho _{T}}$ and which he himself[27] described as being "encyclopedic" (p. 396).

### If there is no measurement error, the value of tau-equivalent reliability is one

This anomaly also originates from the fact that ${\displaystyle \rho _{T}}$ underestimates reliability. Suppose that ${\displaystyle X_{2}}$ is an exact copy of ${\displaystyle X_{1}}$, and ${\displaystyle X_{3}}$ is ${\displaystyle X_{1}}$ multiplied by two. The covariance matrix between items is as follows, and ${\displaystyle \rho _{T}=.9375}$.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ |
|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 2}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 2}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 2}$ | ${\displaystyle 2}$ | ${\displaystyle 4}$ |

For the above data, both ${\displaystyle \rho _{P}}$ and ${\displaystyle \rho _{C}}$ have a value of one.

The above example is presented by Cho and Kim (2015).[7]
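The arithmetic behind these values can be sketched as follows. The expression used for ${\displaystyle \rho _{C}}$ is the usual one-factor form (squared sum of loadings over total variance), which is an assumption here because this article does not state it; the loadings ${\displaystyle (1,1,2)}$ with zero error variances are read off the matrix by inspection.

${\displaystyle \sigma _{X}^{2}=(1+1+4)+2(1+2+2)=16}$,

${\displaystyle \rho _{T}={3 \over 3-1}\left(1-{1+1+4 \over 16}\right)={3 \over 2}\cdot {10 \over 16}=.9375}$,

${\displaystyle \rho _{C}={(1+1+2)^{2} \over (1+1+2)^{2}+0}={16 \over 16}=1}$.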

### A high value of tau-equivalent reliability indicates homogeneity between the items

Many textbooks refer to ${\displaystyle \rho _{T}}$ as an indicator of homogeneity between items. This misconception stems from the inaccurate explanation of Cronbach (1951)[10] that high ${\displaystyle \rho _{T}}$ values show homogeneity between the items. Homogeneity is a term that is rarely used in the modern literature, and related studies interpret the term as referring to unidimensionality. Several studies have provided proofs or counterexamples that high ${\displaystyle \rho _{T}}$ values do not indicate unidimensionality.[28][7][29][30][31][32] See counterexamples below.

Unidimensional data

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ | ${\displaystyle X_{5}}$ | ${\displaystyle X_{6}}$ |
|---|---|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 3}$ | ${\displaystyle 10}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 10}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 10}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ |
| ${\displaystyle X_{5}}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 10}$ | ${\displaystyle 3}$ |
| ${\displaystyle X_{6}}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 3}$ | ${\displaystyle 10}$ |

${\displaystyle \rho _{T}=.72}$ in the unidimensional data above.

Multidimensional data

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ | ${\displaystyle X_{5}}$ | ${\displaystyle X_{6}}$ |
|---|---|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{5}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ | ${\displaystyle 6}$ |
| ${\displaystyle X_{6}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 6}$ | ${\displaystyle 6}$ | ${\displaystyle 10}$ |

${\displaystyle \rho _{T}=.72}$ in the multidimensional data above.

Multidimensional data with extremely high reliability

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ | ${\displaystyle X_{5}}$ | ${\displaystyle X_{6}}$ |
|---|---|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 9}$ | ${\displaystyle 9}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 9}$ | ${\displaystyle 10}$ | ${\displaystyle 9}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 9}$ | ${\displaystyle 9}$ | ${\displaystyle 10}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 10}$ | ${\displaystyle 9}$ | ${\displaystyle 9}$ |
| ${\displaystyle X_{5}}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 9}$ | ${\displaystyle 10}$ | ${\displaystyle 9}$ |
| ${\displaystyle X_{6}}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 8}$ | ${\displaystyle 9}$ | ${\displaystyle 9}$ | ${\displaystyle 10}$ |

The above data have ${\displaystyle \rho _{T}=.9692}$, but are multidimensional.

Unidimensional data with unacceptably low reliability

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle X_{3}}$ | ${\displaystyle X_{4}}$ | ${\displaystyle X_{5}}$ | ${\displaystyle X_{6}}$ |
|---|---|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{3}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{4}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{5}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ | ${\displaystyle 1}$ |
| ${\displaystyle X_{6}}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 1}$ | ${\displaystyle 10}$ |

The above data have ${\displaystyle \rho _{T}=.4}$, but are unidimensional.
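The four values reported for these counterexamples can be verified by hand. Every matrix has ${\displaystyle k=6}$ and item variances of 10, so ${\displaystyle \sum \sigma _{i}^{2}=60}$; writing ${\displaystyle S}$ for the sum of the 30 off-diagonal entries (a symbol introduced here only for this check), the conventional formula reduces to ${\displaystyle \rho _{T}={6 \over 5}\cdot {S \over 60+S}}$:

${\displaystyle S=30\cdot 3=90,\quad \rho _{T}={6 \over 5}\cdot {90 \over 150}=.72}$ (unidimensional data),

${\displaystyle S=12\cdot 6+18\cdot 1=90,\quad \rho _{T}={6 \over 5}\cdot {90 \over 150}=.72}$ (multidimensional data),

${\displaystyle S=12\cdot 9+18\cdot 8=252,\quad \rho _{T}={6 \over 5}\cdot {252 \over 312}=.9692}$ (multidimensional data with extremely high reliability),

${\displaystyle S=30\cdot 1=30,\quad \rho _{T}={6 \over 5}\cdot {30 \over 90}=.4}$ (unidimensional data with unacceptably low reliability).

Because ${\displaystyle \rho _{T}}$ depends on the covariances only through their sum, it cannot distinguish a uniform covariance pattern from a block pattern with the same total.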

Unidimensionality is a prerequisite for ${\displaystyle \rho _{T}}$. You should check unidimensionality before calculating ${\displaystyle \rho _{T}}$, rather than calculating ${\displaystyle \rho _{T}}$ to check unidimensionality.[3]

### A high value of tau-equivalent reliability indicates internal consistency

The term internal consistency is commonly used in the reliability literature, but its meaning is not clearly defined. The term is sometimes used to refer to a certain kind of reliability (e.g., internal consistency reliability), but it is unclear exactly which reliability coefficients are included here, in addition to ${\displaystyle \rho _{T}}$. Cronbach (1951)[10] used the term in several senses without an explicit definition. Cho and Kim (2015)[7] showed that ${\displaystyle \rho _{T}}$ is not an indicator of any of these.

### Removing items using "alpha if item deleted" always increases reliability

Removing an item using "alpha if item deleted" may result in 'alpha inflation,' where sample-level reliability is reported to be higher than population-level reliability.[33] It may also reduce population-level reliability.[34] The elimination of less-reliable items should be based not only on a statistical basis, but also on a theoretical and logical basis. It is also recommended that the whole sample be divided into two and cross-validated.[33]
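For reference, the "alpha if item deleted" statistic itself is simple to compute: each item is dropped in turn and ${\displaystyle \rho _{T}}$ is recomputed on the remaining items. The sketch below uses illustrative function names and, purely as example input, the non-tau-equivalent covariance matrix from the calculation example. It shows how the statistic is obtained, not that acting on it is advisable; the cautions in the paragraph above still apply.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def rho_t(cov: Matrix) -> float:
    """Conventional formula for tau-equivalent reliability."""
    k = len(cov)
    item_variances = sum(cov[i][i] for i in range(k))
    total_variance = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - item_variances / total_variance)


def alpha_if_item_deleted(cov: Matrix) -> List[float]:
    """rho_T of the remaining items after dropping each item in turn."""
    k = len(cov)
    if k < 3:
        raise ValueError("at least three items are needed to drop one")
    results = []
    for drop in range(k):
        keep = [i for i in range(k) if i != drop]
        # Sub-matrix of the covariance matrix without the dropped item.
        reduced = [[cov[i][j] for j in keep] for i in keep]
        results.append(rho_t(reduced))
    return results


example = [
    [10, 4, 5, 7],
    [4, 11, 6, 8],
    [5, 6, 12, 9],
    [7, 8, 9, 13],
]

print("all items:", round(rho_t(example), 4))
for item, value in enumerate(alpha_if_item_deleted(example), start=1):
    print(f"without item {item}:", round(value, 4))
```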

## Ideal reliability level and how to increase reliability

### Nunnally's recommendations for the level of reliability

The most frequently cited source on how high reliability coefficients should be is Nunnally's book.[35][36][37] However, his recommendations are cited contrary to his intentions. What he meant was to apply different criteria depending on the purpose or stage of the study. However, regardless of the nature of the research, such as exploratory research, applied research, and scale development research, a criterion of .7 is universally used.[38] .7 is the criterion he recommended for the early stages of a study, which most studies published in journals are not. Rather than .7, the criterion of .8 that Nunnally recommended for applied research is more appropriate for most empirical studies.[38]

Nunnally's recommendations on the level of reliability

| | 1st edition[35] | 2nd[36] & 3rd[37] edition |
|---|---|---|
| Early stage of research | .5 or .6 | .7 |
| Applied research | .8 | .8 |
| When making important decisions | .95 (minimum .9) | .95 (minimum .9) |

His recommendation level did not imply a cutoff point. If a criterion means a cutoff point, it is important whether or not it is met, but it is unimportant how much it is over or under. He did not mean that it should be strictly .8 when referring to the criteria of .8. If the reliability has a value near .8 (e.g., 0.78), it can be considered that his recommendation has been met.[39]

His idea was that there is a cost to increasing reliability, so there is no need to try to obtain maximum reliability in every situation.

### Cost to obtain a high level of reliability

Many textbooks explain that the higher the value of reliability, the better. The potential side effects of high reliability are rarely discussed. However, the principle that something must be sacrificed to gain something else also applies to reliability.

#### Trade-off between reliability and validity

Measurements with perfect reliability lack validity.[7] For example, a person who takes a test with a reliability of one will get either a perfect score or a zero score, because an examinee who answers one item correctly or incorrectly will answer all other items the same way. The phenomenon in which validity is sacrificed to increase reliability is called the attenuation paradox.[40][41]

A high value of reliability can conflict with content validity. For high content validity, the items should be constructed so that they comprehensively represent the content to be measured. However, a strategy of asking essentially the same question repeatedly in different ways is often used solely to increase reliability.[42][43]

#### Trade-off between reliability and efficiency

When the other conditions are equal, reliability increases as the number of items increases. However, the increase in the number of items hinders the efficiency of measurements.
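One standard way to quantify this relationship is the Spearman–Brown prediction formula, which is not stated in this section and is brought in here only as an illustration. It assumes that the added items are parallel to the existing ones. The function name and the starting reliability of .5 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test length is multiplied by length_factor.

    Assumes the added items are parallel to the existing items.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# Reliability of a test whose length is multiplied by 1, 2, 4, and 8.
for n in (1, 2, 4, 8):
    print(n, round(spearman_brown(0.5, n), 3))
# 1 0.5
# 2 0.667
# 4 0.8
# 8 0.889
```

Each doubling of test length yields a smaller gain than the previous one, which illustrates why adding items becomes increasingly costly in terms of efficiency.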

### Methods to increase reliability

Despite the costs associated with increasing reliability discussed above, a high level of reliability may be required. The following methods can be considered to increase reliability.

#### Before data collection

Eliminate the ambiguity of the measurement item.

Do not measure what the respondents do not know.

Increase the number of items. However, care should be taken not to excessively inhibit the efficiency of the measurement.

Use a scale that is known to be highly reliable.[44]

Conduct a pretest to discover reliability problems in advance.

Exclude or modify items that are different in content or form from other items (e.g., reverse-scored items).

#### After data collection

Remove the problematic items using "alpha if item deleted". However, this deletion should be accompanied by a theoretical rationale.

Use a more accurate reliability coefficient than ${\displaystyle \rho _{T}}$. For example, ${\displaystyle \rho _{C}}$ is .02 larger than ${\displaystyle \rho _{T}}$ on average.[15]

## Which reliability coefficient to use

### Should we continue to use tau-equivalent reliability?

${\displaystyle {\rho }_{T}}$ is used in the overwhelming majority of studies. One study estimates that approximately 97% of studies use ${\displaystyle {\rho }_{T}}$ as a reliability coefficient.[3]

However, simulation studies comparing the accuracy of several reliability coefficients have commonly found that ${\displaystyle {\rho }_{T}}$ is an inaccurate reliability coefficient.[45][13][6][46][47]

Methodological studies are critical of the use of ${\displaystyle {\rho }_{T}}$. The conclusions of existing studies can be simplified and classified as follows.

(1) Conditional use: Use ${\displaystyle {\rho }_{T}}$ only when certain conditions are met.[3][7][9]

(2) Opposition to use: ${\displaystyle {\rho }_{T}}$ is inferior and should not be used.[48][5][49][6][4][50]

### Alternatives to tau-equivalent reliability

Existing studies are practically unanimous in that they oppose the widespread practice of using ${\displaystyle {\rho }_{T}}$ unconditionally for all data. However, different opinions are given on which reliability coefficient should be used instead of ${\displaystyle {\rho }_{T}}$.

A different reliability coefficient ranked first in each of the simulation studies[45][13][6][46][47] that compared the accuracy of several reliability coefficients.[7]

The majority opinion is to use SEM-based reliability coefficients as an alternative to ${\displaystyle {\rho }_{T}}$.[3][7][48][5][49][9][6][50]

However, there is no consensus on which of the several SEM-based reliability coefficients (e.g., unidimensional or multidimensional models) is the best to use.

Some people suggest ${\displaystyle {\omega }_{H}}$[6] as an alternative, but ${\displaystyle {\omega }_{H}}$ conveys information that is completely different from reliability. ${\displaystyle {\omega }_{H}}$ is a type of coefficient comparable to Revelle's ${\displaystyle \beta }$.[14][6] These coefficients complement reliability rather than substitute for it.[3]

Among SEM-based reliability coefficients, multidimensional reliability coefficients are rarely used, and the most commonly used is ${\displaystyle {\rho }_{C}}$.[3]

### Software for SEM-based reliability coefficients

General-purpose statistical software such as SPSS and SAS includes a function to calculate ${\displaystyle {\rho }_{T}}$. Even users who do not know the formula for ${\displaystyle {\rho }_{T}}$ can obtain estimates with a few mouse clicks.

SEM software such as AMOS, LISREL, and Mplus does not have a function to calculate SEM-based reliability coefficients. Users need to calculate the coefficient by entering the SEM output into the formula. To avoid this inconvenience and possible error, even studies reporting the use of SEM rely on ${\displaystyle {\rho }_{T}}$ instead of SEM-based reliability coefficients.[3] A few alternatives calculate SEM-based reliability coefficients automatically.

1) R (free): The psych package[51] calculates various reliability coefficients.

2) EQS (paid):[52] This SEM software has a function to calculate reliability coefficients.

3) RelCalc (free):[3] Available with Microsoft Excel. ${\displaystyle \rho _{C}}$ can be obtained without the need for SEM software. Various multidimensional SEM reliability coefficients and various types of ${\displaystyle {\omega }_{H}}$ can be calculated based on the results of SEM software.
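Entering SEM output into the formula by hand, as described above, can also be sketched in a few lines of code. The sketch below uses Python purely for illustration and is not the method of any of the tools listed. It assumes a unidimensional model with the factor variance fixed at 1 and uncorrelated errors, under which ${\displaystyle \rho _{C}}$ is the squared sum of the loadings divided by the squared sum of the loadings plus the sum of the error variances. The function names and the loading values are hypothetical.

```python
import numpy as np

def composite_reliability(loadings, error_variances):
    """rho_C under a unidimensional model (factor variance 1, uncorrelated errors)."""
    loadings = np.asarray(loadings, dtype=float)
    error_variances = np.asarray(error_variances, dtype=float)
    true_variance = loadings.sum() ** 2
    return true_variance / (true_variance + error_variances.sum())

def alpha_from_covariance(cov):
    """rho_T computed from an item covariance matrix (conventional formula)."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())

# Hypothetical estimates as they might be copied from SEM output (not real data).
loadings = np.array([0.7, 0.6, 0.8, 0.5])
error_variances = np.array([0.51, 0.64, 0.36, 0.75])

# Model-implied covariance matrix: outer product of loadings plus diagonal errors.
implied_cov = np.outer(loadings, loadings) + np.diag(error_variances)

rho_c = composite_reliability(loadings, error_variances)
rho_t = alpha_from_covariance(implied_cov)
print(round(rho_c, 3), round(rho_t, 3))
```

With these unequal loadings, ${\displaystyle \rho _{T}}$ (about .742) comes out slightly below ${\displaystyle \rho _{C}}$ (about .749), because the data are not tau-equivalent.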

## Derivation of formula[3]

Assumption 1. The observed score of an item consists of the true score of the item and the error of the item, which is independent of the true score. ${\displaystyle X_{i}=T_{i}+e_{i}}$

Lemma. ${\displaystyle Cov(T_{i},e_{i})=Cov(T_{i},e_{j})=0,\forall i\neq j}$

Assumption 2. Errors are independent of each other. ${\displaystyle Cov(e_{i},e_{j})=0,\forall i\neq j}$

Assumption 3. (The assumption of being essentially tau-equivalent) The true score of an item consists of the true score common to all items and the constant of the item. ${\displaystyle T_{i}=\mu _{i}+t}$

Let ${\displaystyle T}$ denote the sum of the item true scores. ${\displaystyle T=\sum _{i=1}^{k}T_{i}}$

The variance of ${\displaystyle T}$ is called the true score variance.

Definition. Reliability is the ratio of true score variance to observed score variance. ${\displaystyle \rho ={\sigma _{T}^{2} \over \sigma _{X}^{2}}}$

The following relationships follow from the above assumptions.

${\displaystyle \sigma _{i}^{2}=Var(\mu _{i}+t+e_{i})=\sigma _{t}^{2}+\sigma _{e_{i}}^{2}(\because Var(\mu _{i})=Cov(t,e_{i})=Cov(\mu _{i},e_{i})=Cov(\mu _{i},t)=0)}$

${\displaystyle \sigma _{ij}=Cov(T_{i}+e_{i},T_{j}+e_{j})=\sigma _{t}^{2}(\because Cov(T_{i},e_{j})=Cov(T_{j},e_{i})=Cov(e_{i},e_{j})=0)}$

Therefore, the covariance matrix between items is as follows.

Observed covariance matrix

| | ${\displaystyle X_{1}}$ | ${\displaystyle X_{2}}$ | ${\displaystyle \ldots }$ | ${\displaystyle X_{k}}$ |
|---|---|---|---|---|
| ${\displaystyle X_{1}}$ | ${\displaystyle \sigma _{t}^{2}+\sigma _{e_{1}}^{2}}$ | ${\displaystyle \sigma _{t}^{2}}$ | ${\displaystyle \ldots }$ | ${\displaystyle \sigma _{t}^{2}}$ |
| ${\displaystyle X_{2}}$ | ${\displaystyle \sigma _{t}^{2}}$ | ${\displaystyle \sigma _{t}^{2}+\sigma _{e_{2}}^{2}}$ | ${\displaystyle \ldots }$ | ${\displaystyle \sigma _{t}^{2}}$ |
| ${\displaystyle \vdots }$ | ${\displaystyle \vdots }$ | ${\displaystyle \vdots }$ | ${\displaystyle \ddots }$ | ${\displaystyle \vdots }$ |
| ${\displaystyle X_{k}}$ | ${\displaystyle \sigma _{t}^{2}}$ | ${\displaystyle \sigma _{t}^{2}}$ | ${\displaystyle \ldots }$ | ${\displaystyle \sigma _{t}^{2}+\sigma _{e_{k}}^{2}}$ |

Because every off-diagonal element of the matrix equals ${\displaystyle \sigma _{t}^{2}}$, ${\displaystyle \sigma _{t}^{2}}$ equals the mean of the covariances between items. That is, ${\displaystyle \sigma _{t}^{2}={\overline {\sigma _{ij}}}}$

${\displaystyle \sigma _{T}^{2}=Var\left(\sum _{i=1}^{k}T_{i}\right)=Var\left(\sum _{i=1}^{k}\mu _{i}+kt\right)=k^{2}\sigma _{t}^{2}=k^{2}{\overline {\sigma _{ij}}}}$
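These relationships can be checked numerically. The sketch below builds a covariance matrix that satisfies the assumptions above and compares the reliability obtained from the definition with the conventional formula from the "Formula and calculation" section. The variable names and the chosen variances are hypothetical values picked for illustration.

```python
import numpy as np

k = 4
var_t = 2.0                                  # assumed common true score variance
err_vars = np.array([1.0, 1.5, 0.5, 2.0])    # assumed item error variances

# Observed covariance matrix: var_t everywhere, plus error variance on the diagonal.
sigma = np.full((k, k), var_t) + np.diag(err_vars)

# Mean of the inter-item (off-diagonal) covariances.
off_diagonal = sigma[~np.eye(k, dtype=bool)]
mean_cov = off_diagonal.mean()

true_var = k ** 2 * mean_cov                 # variance of the summed true scores
total_var = sigma.sum()                      # variance of the summed observed scores
reliability = true_var / total_var           # definition of reliability

# Conventional formula, computed from the same matrix.
conventional = k / (k - 1) * (1 - np.trace(sigma) / total_var)

print(mean_cov, reliability, conventional)
```

Here the mean covariance recovers the true score variance of 2, and both expressions give 32/37, as expected when the data are essentially tau-equivalent.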

Let ${\displaystyle \rho _{T}}$ denote the reliability when the above assumptions are satisfied. By the definition of reliability, ${\displaystyle \rho _{T}}$ is:

${\displaystyle \rho _{T}={k^{2}{\overline {\sigma _{ij}}} \over \sigma _{X}^{2}}}$
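The more frequently used version of the formula follows from this result. As a sketch that uses only the decomposition of ${\displaystyle \sigma _{X}^{2}}$ into item variances and inter-item covariances given in the "Formula and calculation" section:

${\displaystyle k(k-1){\overline {\sigma _{ij}}}=\sum _{i=1}^{k}\sum _{j\neq i}^{k}\sigma _{ij}=\sigma _{X}^{2}-\sum _{i=1}^{k}\sigma _{i}^{2}}$

${\displaystyle \rho _{T}={k^{2}{\overline {\sigma _{ij}}} \over \sigma _{X}^{2}}={k \over k-1}\cdot {k(k-1){\overline {\sigma _{ij}}} \over \sigma _{X}^{2}}={k \over k-1}\left(1-{\sum _{i=1}^{k}\sigma _{i}^{2} \over \sigma _{X}^{2}}\right)}$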

## References

1. Cronbach, Lee J. (1951). "Coefficient alpha and the internal structure of tests". Psychometrika. Springer Science and Business Media LLC. 16 (3): 297–334. doi:10.1007/bf02310555. hdl:10983/2196. ISSN 0033-3123. S2CID 13820448.
2. Cronbach, L. J. (1978). "Citation Classics" (PDF). Current Contents. 13: 263.
3. Cho, Eunseong (2016-07-08). "Making Reliability Reliable". Organizational Research Methods. SAGE Publications. 19 (4): 651–682. doi:10.1177/1094428116656239. ISSN 1094-4281. S2CID 124129255.
4. Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107–120. https://doi.org/10.1007/s11336-008-9101-0
5. Green, S. B., & Yang, Y. (2009). Commentary on coefficient alpha: A cautionary tale. Psychometrika, 74(1), 121–135. https://doi.org/10.1007/s11336-008-9098-4
6. Revelle, W., & Zinbarg, R. E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74(1), 145–154. https://doi.org/10.1007/s11336-008-9102-z
7. Cho, E., & Kim, S. (2015). Cronbach’s coefficient alpha: Well known but poorly understood. Organizational Research Methods, 18(2), 207–230. https://doi.org/10.1177/1094428114555994
8. McNeish, D. (2017). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412–433. https://doi.org/10.1037/met0000144
9. Raykov, T., & Marcoulides, G. A. (2017). Thanks coefficient alpha, we still need you! Educational and Psychological Measurement, 79(1), 200–210. https://doi.org/10.1177/0013164417725127
10. Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16 (3), 297–334. https://doi.org/10.1007/BF02310555
11. Cho, E. and Chun, S. (2018), Fixing a broken clock: A historical review of the originators of reliability coefficients including Cronbach's alpha. Survey Research, 19(2), 23–54.
12. Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10(4), 255–282. https://doi.org/10.1007/BF02288892
13. Osburn, H. G. (2000). Coefficient alpha and related internal consistency reliability coefficients. Psychological Methods, 5(3), 343–355. https://doi.org/10.1037/1082-989X.5.3.343
14. Revelle, W. (1979). Hierarchical cluster analysis and the internal structure of tests. Multivariate Behavioral Research, 14(1), 57–74. https://doi.org/10.1207/s15327906mbr1401_4
15. Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability. Journal of Applied Psychology, 98(1), 194–198. https://doi.org/10.1037/a0030767
16. Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3(3), 296–322. https://doi.org/10.1111/j.2044-8295.1910.tb00207.x
17. Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3(3), 271–295. https://doi.org/10.1111/j.2044-8295.1910.tb00206.x
18. Kelley, T. L. (1924). Note on the reliability of a test: A reply to Dr. Crum’s criticism. Journal of Educational Psychology, 15(4), 193–204. https://doi.org/10.1037/h0072471
19. Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2(3), 151–160. https://doi.org/10.1007/BF02288391
20. Hoyt, C. (1941). Test reliability estimated by analysis of variance. Psychometrika, 6(3), 153–160. https://doi.org/10.1007/BF02289270
21. Jackson, R. W. B., & Ferguson, G. A. (1941). Studies on the reliability of tests. University of Toronto Department of Educational Research Bulletin, 12, 132.
22. Edgerton, H. A., & Thomson, K. F. (1942). Test scores examined with the lexis ratio. Psychometrika, 7(4), 281–288. https://doi.org/10.1007/BF02288629
23. Gulliksen, H. (1950). Theory of mental tests. John Wiley & Sons. https://doi.org/10.1037/13240-000
24. Cronbach, L. J. (1943). On estimates of test reliability. Journal of Educational Psychology, 34(8), 485–494. https://doi.org/10.1037/h0058608
25. Hoyt, C. J. (1941). Note on a simplified method of computing test reliability. Educational and Psychological Measurement, 1(1). https://doi.org/10.1177/001316444100100109
26. Novick, M. R., & Lewis, C. (1967). Coefficient alpha and the reliability of composite measurements. Psychometrika, 32(1), 1–13. https://doi.org/10.1007/BF02289400
27. Cronbach, L. J., & Shavelson, R. J. (2004). My Current Thoughts on Coefficient Alpha and Successor Procedures. Educational and Psychological Measurement, 64(3), 391–418. https://doi.org/10.1177/0013164404266386
28. Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98–104. https://doi.org/10.1037/0021-9010.78.1.98
29. Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an Index of test unidimensionality. Educational and Psychological Measurement, 37(4), 827–838. https://doi.org/10.1177/001316447703700403
30. McDonald, R. P. (1981). The dimensionality of tests and items. The British Journal of Mathematical and Statistical Psychology, 34(1), 100–117. https://doi.org/10.1111/j.2044-8317.1981.tb00621.x
31. Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4), 350–353. https://doi.org/10.1037/1040-3590.8.4.350
32. Ten Berge, J. M. F., & Sočan, G. (2004). The greatest lower bound to the reliability of a test and the hypothesis of unidimensionality. Psychometrika, 69(4), 613–625. https://doi.org/10.1007/BF02289858
33. Kopalle, P. K., & Lehmann, D. R. (1997). Alpha inflation? The impact of eliminating scale items on Cronbach’s alpha. Organizational Behavior and Human Decision Processes, 70(3), 189–197. https://doi.org/10.1006/obhd.1997.2702
34. Raykov, T. (2007). Reliability if deleted, not ‘alpha if deleted’: Evaluation of scale reliability following component deletion. The British Journal of Mathematical and Statistical Psychology, 60(2), 201–216. https://doi.org/10.1348/000711006X115954
35. Nunnally, J. C. (1967). Psychometric theory. New York, NY: McGraw-Hill.
36. Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York, NY: McGraw-Hill.
37. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill.
38. Lance, C. E., Butts, M. M., & Michels, L. C. (2006). What did they really say? Organizational Research Methods, 9(2), 202–220. https://doi.org/10.1177/1094428105284919
39. Cho, E. (2020). A comprehensive review of so-called Cronbach's alpha. Journal of Product Research, 38(1), 9–20.
40. Loevinger, J. (1954). The attenuation paradox in test theory. Psychological Bulletin, 51(5), 493–504. https://doi.org/10.1002/j.2333-8504.1954.tb00485.x
41. Humphreys, L. (1956). The normal curve and the attenuation paradox in test theory. Psychological Bulletin, 53(6), 472–476. https://doi.org/10.1037/h0041091
42. Boyle, G. J. (1991). Does item homogeneity indicate internal consistency or item redundancy in psychometric scales? Personality and Individual Differences, 12(3), 291–294. https://doi.org/10.1016/0191-8869(91)90115-R
43. Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80(1), 99–103. https://doi.org/10.1207/S15327752JPA8001_18
44. Lee, H. (2017). Research Methodology (2nd ed.), Hakhyunsa.
45. Kamata, A., Turhan, A., & Darandari, E. (2003). Estimating reliability for multidimensional composite scale scores. Annual Meeting of American Educational Research Association, Chicago, April 2003, April, 1–27.
46. Tang, W., & Cui, Y. (2012). A simulation study for comparing three lower bounds to reliability. Paper Presented on April 17, 2012 at the AERA Division D: Measurement and Research Methodology, Section 1: Educational Measurement, Psychometrics, and Assessment., 1–25.
47. van der Ark, L. A., van der Palm, D. W., & Sijtsma, K. (2011). A latent class approach to estimating test-score reliability. Applied Psychological Measurement, 35(5), 380–392. https://doi.org/10.1177/0146621610392911
48. Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412. https://doi.org/10.1111/bjop.12046
49. Peters, G. Y. (2014). The alpha and the omega of scale reliability and validity comprehensive assessment of scale quality. The European Health Psychologist, 1(2), 56–69.
50. Yang, Y., & Green, S. B. (2011). Coefficient alpha: A reliability coefficient for the 21st century? Journal of Psychoeducational Assessment, 29(4), 377–392. https://doi.org/10.1177/0734282911406668
51. http://personality-project.org/r/overview.pdf
52. http://www.mvsoft.com/eqs60.htm