Multitrait-multimethod matrix

The multitrait-multimethod (MTMM) matrix is an approach to examining construct validity developed by Campbell and Fiske (1959).^[1] It organizes convergent and discriminant validity evidence so that one can compare how a measure relates to other measures.

Definitions and key components

Multiple traits are used in this approach to examine (a) similar or (b) dissimilar traits (constructs), so as to establish convergent and discriminant validity between traits. Similarly, multiple methods are used to examine the differential effects (or lack thereof) caused by method-specific variance.

There are six major considerations when examining a construct's validity through the MTMM matrix, which are as follows:

Evaluation of convergent validity – Tests designed to measure the same construct should correlate highly amongst themselves.
Evaluation of discriminant (divergent) validity – The construct being measured by a test should not correlate highly with different constructs.
Trait-method unit – Each task or test used in measuring a construct is considered a trait-method unit, in that the variance contained in the measure is part trait and part method. Generally, researchers desire low method-specific variance and high trait variance.
Multitrait-multimethod – More than one trait and more than one method must be used to establish (a) discriminant validity and (b) the relative contributions of the trait-specific or method-specific variance. This tenet is consistent with the ideas proposed in Platt's (1964) concept of strong inference.^[2]
Truly different methodology – When using multiple methods, one must consider how different the actual measures are. For instance, administering two self-report measures does not amount to using truly different methods, whereas using an interview scale or a psychosomatic reading would.
Trait characteristics – Traits should be different enough to be distinct, but similar enough to be worth examining in the MTMM.

Example

The example below provides a prototypical matrix and what the correlations between measures mean. The diagonal line is typically filled in with a reliability coefficient of the measure (e.g. alpha coefficient). Descriptions in brackets [] indicate what is expected when the validity of the construct (e.g., depression or anxiety) and the validities of the measures are all high.

Test	Beck Depression Inventory (BDI)	Hepner Depression Interview (HDIv)	Beck Anxiety Inventory (BAI)	Hepner Anxiety Interview (HAIv)
BDI	(Reliability Coefficient) [close to 1.00]
HDIv	Heteromethod-monotrait [highest of all except reliability]	(Reliability Coefficient) [close to 1.00]
BAI	Monomethod-heterotrait [low, less than monotrait]	Heteromethod-heterotrait [lowest of all]	(Reliability Coefficient) [close to 1.00]
HAIv	Heteromethod-heterotrait [lowest of all]	Monomethod-heterotrait [low, less than monotrait]	Heteromethod-monotrait [highest of all except reliability]	(Reliability Coefficient) [close to 1.00]

In this example, the first row and the first column display the trait being assessed (i.e., anxiety or depression) as well as the method of assessing this trait (i.e., interview or survey, as measured by fictitious measures). The term heteromethod indicates that the cell reports the correlation between two separate methods. Monomethod indicates the opposite: the same method is being used (e.g., interview and interview). Heterotrait indicates that the cell reports the correlation between two supposedly different traits. Monotrait indicates the opposite: the same trait is being measured.
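The cell labels in the table follow mechanically from each measure's trait and method. The sketch below is only an illustration of that rule: the `Measure` type, the `classify_cell` function, and the method names "inventory" and "interview" are hypothetical names introduced here, not part of any published MTMM software.

```python
from typing import NamedTuple


class Measure(NamedTuple):
    """A trait-method unit (hypothetical helper type for this sketch)."""
    trait: str
    method: str


def classify_cell(a: Measure, b: Measure) -> str:
    """Return the MTMM label for the cell correlating measures a and b."""
    if a == b:
        # A measure paired with itself sits on the reliability diagonal.
        return "reliability diagonal"
    method_part = "monomethod" if a.method == b.method else "heteromethod"
    trait_part = "monotrait" if a.trait == b.trait else "heterotrait"
    return f"{method_part}-{trait_part}"


# The four measures from the example table.
BDI = Measure("depression", "inventory")
HDIv = Measure("depression", "interview")
BAI = Measure("anxiety", "inventory")
HAIv = Measure("anxiety", "interview")

for row, col in [(HDIv, BDI), (BAI, BDI), (BAI, HDIv), (HAIv, BAI)]:
    print(row, col, classify_cell(row, col))
```

Running the loop reproduces the labels shown in the corresponding cells of the table.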

In evaluating an actual matrix, one examines the proportion of variance shared among traits and methods to gauge how much method-specific variance the measurement method induces and how distinct each trait is from the others.

That is, the trait should matter more than the specific method of measurement. For example, if a person is measured as being highly depressed by one measure, then another type of measure should also indicate that the person is highly depressed. On the other hand, people who appear highly depressed on the Beck Depression Inventory should not necessarily get high anxiety scores on Beck's Anxiety Inventory. Since the inventories were written by the same person and are similar in style, there might be some correlation, but this similarity in method should not affect the scores much, so the correlations between these measures of different traits should be low.
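As a rough illustration of this reasoning, the sketch below groups the off-diagonal correlations by cell type and checks the coarse ordering described in the table. The correlation values are invented purely for illustration, the function names are hypothetical, and the check is a simplification rather than Campbell and Fiske's full set of cell-by-cell comparisons.

```python
# (trait, method) for each measure in the example table.
measures = {
    "BDI": ("depression", "inventory"),
    "HDIv": ("depression", "interview"),
    "BAI": ("anxiety", "inventory"),
    "HAIv": ("anxiety", "interview"),
}

# Hypothetical correlations, made up for this sketch only.
corr = {
    ("BDI", "HDIv"): 0.70,
    ("BDI", "BAI"): 0.35,
    ("BDI", "HAIv"): 0.20,
    ("HDIv", "BAI"): 0.22,
    ("HDIv", "HAIv"): 0.30,
    ("BAI", "HAIv"): 0.65,
}


def group_cells(measures, corr):
    """Sort each off-diagonal correlation into its MTMM cell type."""
    groups = {
        "heteromethod-monotrait": [],
        "monomethod-heterotrait": [],
        "heteromethod-heterotrait": [],
    }
    for (a, b), r in corr.items():
        same_trait = measures[a][0] == measures[b][0]
        same_method = measures[a][1] == measures[b][1]
        if same_trait and not same_method:
            groups["heteromethod-monotrait"].append(r)
        elif same_method and not same_trait:
            groups["monomethod-heterotrait"].append(r)
        elif not same_trait and not same_method:
            groups["heteromethod-heterotrait"].append(r)
    return groups


def expected_ordering_holds(groups):
    """Check: monotrait cells highest, heteromethod-heterotrait cells lowest."""
    validity = groups["heteromethod-monotrait"]
    same_method = groups["monomethod-heterotrait"]
    nothing_shared = groups["heteromethod-heterotrait"]
    return min(validity) > max(same_method) and min(same_method) > max(nothing_shared)


groups = group_cells(measures, corr)
print(groups)
print("expected ordering holds:", expected_ordering_holds(groups))
```

With these invented values, the two monotrait correlations exceed every heterotrait correlation, which is the pattern the bracketed descriptions in the table describe.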

Analysis

A variety of statistical approaches have been used to analyze the data from the MTMM matrix. The standard method from Campbell and Fiske can be implemented using the MTMM.EXE program (available at http://gim.med.ucla.edu/FacultyPages/Hays/utils/). One can also use confirmatory factor analysis^[3] because of the complexities in considering all of the data in the matrix. The Sawilowsky I test,^[4]^[5] however, considers all of the data in the matrix with a distribution-free statistical test for trend.

The test is conducted by reducing the heterotrait-heteromethod and heterotrait-monomethod triangles, and the validity and reliability diagonals, into a matrix of four levels. Each level consists of the minimum, median, and maximum value. The null hypothesis is that these values are unordered, which is tested against the alternative hypothesis of an increasing ordered trend. The test statistic is found by counting the number of inversions (I). The critical value for alpha = 0.05 is 10, and for alpha = 0.01 is 14.
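The reduction and counting steps can be sketched as follows. This is one plausible reading only: it assumes that an inversion is any value in a lower level that strictly exceeds a value in a higher level, which should be checked against the cited papers. The function names and the example values are hypothetical, and the comparison of I with the critical values is deliberately left out.

```python
from itertools import combinations
from statistics import median


def summarize(values):
    """Reduce one triangle or diagonal to (minimum, median, maximum)."""
    return (min(values), median(values), max(values))


def count_inversions(levels):
    """Count pairs where a lower-level value exceeds a higher-level value.

    `levels` is ordered from the level expected to be lowest to the level
    expected to be highest. The definition of an inversion used here is an
    assumption of this sketch.
    """
    inversions = 0
    for lower, higher in combinations(levels, 2):
        for x in lower:
            for y in higher:
                if x > y:
                    inversions += 1
    return inversions


# Hypothetical correlations for the four levels, invented for illustration.
levels = [
    summarize([0.10, 0.20, 0.30]),  # heterotrait-heteromethod triangle
    summarize([0.25, 0.35, 0.45]),  # heterotrait-monomethod triangle
    summarize([0.50, 0.60, 0.70]),  # validity diagonal
    summarize([0.80, 0.85, 0.90]),  # reliability diagonal
]
print("I =", count_inversions(levels))
```

In this invented example the only inversion is 0.30 in the first level exceeding 0.25 in the second.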

One of the most widely used models for analyzing MTMM data is the True Score model proposed by Saris and Andrews (1991).^[6] The True Score model can be expressed using the following standardized equations:

    1)  $Y_{ij} = r_{ij} TS_{ij} + e_{ij}^{*}$  where:
          $Y_{ij}$  is the standardized observed variable measured with the $i$-th trait and $j$-th method.
          $r_{ij}$  is the reliability coefficient, which is equal to:
            $r_{ij} = \sigma_{TS_{ij}} / \sigma_{Y_{ij}}$
          $TS_{ij}$  is the standardized true score variable
          $e_{ij}^{*}$  is the random error, which is equal to:
            $e_{ij}^{*} = e_{ij} / \sigma_{Y_{ij}}$

     Consequently:
        $r_{ij}^{2} = 1 - \sigma^{2}(e_{ij}^{*})$  where:
          $r_{ij}^{2}$  is the reliability

    2)  $TS_{ij} = v_{ij} F_{i} + m_{ij} M_{j}$  where:
          $v_{ij}$  is the validity coefficient, which is equal to:
            $v_{ij} = \sigma_{F_{i}} / \sigma_{TS_{ij}}$
          $F_{i}$  is the standardized latent factor for the $i$-th variable of interest (or trait)
          $m_{ij}$  is the method effect, which is equal to:
            $m_{ij} = \sigma_{M_{j}} / \sigma_{TS_{ij}}$
          $M_{j}$  is the standardized latent factor for the reaction to the $j$-th method

     Consequently:
        $v_{ij}^{2} = 1 - m_{ij}^{2}$  where:
          $v_{ij}^{2}$  is the validity

    3)  $Y_{ij} = q_{ij} F_{i} + r_{ij} m_{ij} M_{j} + e_{ij}^{*}$  where:
          $q_{ij}$  is the quality coefficient, which is equal to:
            $q_{ij} = r_{ij} \cdot v_{ij}$

     Consequently:
        $q_{ij}^{2} = r_{ij}^{2} \cdot v_{ij}^{2} = \sigma^{2}_{F_{i}} / \sigma^{2}_{Y_{ij}}$  where:
          $q_{ij}^{2}$  is the quality
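Equation 3 implies that the unit variance of the standardized observed variable splits into a trait part, a method part, and a random error part. The sketch below works through that arithmetic for one measure. The function name and the coefficient values are hypothetical and chosen only for illustration.

```python
def decompose_variance(r: float, v: float) -> dict:
    """Split Var(Y) = 1 into trait, method, and random error shares.

    r is the reliability coefficient and v the validity coefficient of one
    trait-method unit; both are assumed to lie in [0, 1].
    """
    m_squared = 1.0 - v ** 2           # method effect squared, from v^2 = 1 - m^2
    return {
        "quality": (r * v) ** 2,       # q^2 = r^2 * v^2, variance due to the trait
        "method": r ** 2 * m_squared,  # (r * m)^2, variance due to the method
        "random_error": 1.0 - r ** 2,  # sigma^2(e*) = 1 - r^2
    }


# Hypothetical coefficients for a single survey question.
parts = decompose_variance(r=0.90, v=0.95)
print(parts)
print("total:", sum(parts.values()))
```

The three shares sum to 1 for any admissible r and v, because r²v² + r²(1 - v²) + (1 - r²) = 1.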

The assumptions are the following:

     * The errors are random, thus the mean of the errors is zero:  $\mu_{e} = E(e) = 0$
     * The random errors are uncorrelated with each other:  $cov(e_{i}, e_{j}) = E(e_{i} e_{j}) = 0$
     * The random errors are uncorrelated with the independent variables:  $cov(TS, e) = E(TS \cdot e) = 0$,  $cov(F, e) = E(F \cdot e) = 0$  and  $cov(M, e) = E(M \cdot e) = 0$
     * The method factors are assumed to be uncorrelated with one another and with the trait factors:  $cov(F, M) = E(F \cdot M) = 0$
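One consequence of these assumptions is that two measures of the same trait obtained with different methods should correlate at the product of their quality coefficients. The simulation below is a minimal sketch of that consequence, not an estimation procedure. It assumes normally distributed factors and errors, and the coefficient values, sample size, seed, and function names are all hypothetical.

```python
import math
import random


def simulate_measure(trait, method, r, v, rng):
    """Generate Y = r * (v*F + m*M) + e*, with Var(e*) = 1 - r^2."""
    m = math.sqrt(1.0 - v ** 2)
    error_sd = math.sqrt(1.0 - r ** 2)
    return [r * (v * f + m * g) + rng.gauss(0.0, error_sd)
            for f, g in zip(trait, method)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000
F = [rng.gauss(0.0, 1.0) for _ in range(n)]   # one trait factor
M1 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # method factor 1
M2 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # method factor 2, independent of M1 and F

y1 = simulate_measure(F, M1, r=0.90, v=0.95, rng=rng)
y2 = simulate_measure(F, M2, r=0.80, v=0.90, rng=rng)

observed = pearson(y1, y2)
expected = (0.90 * 0.95) * (0.80 * 0.90)  # q_1 * q_2
print(f"observed {observed:.3f}, expected {expected:.3f}")
```

Because the factors and errors are drawn independently, every zero-covariance assumption in the list holds by construction, and the observed correlation should land close to the product of the two quality coefficients.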

Typically, each respondent must answer questions on at least three different traits, each measured using at least three different methods. This model has been used to estimate the quality of thousands of survey questions, in particular within the framework of the European Social Survey. These estimates can be consulted through the Survey Quality Predictor software, available at sqp.upf.edu.

References

[1] Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
[2] Platt, J. R. (1964). Strong inference. Science, 146(3642).
[3] Figueredo, A., Ferketich, S., & Knapp, T. (1991). Focus on psychometrics: More on MTMM: The role of confirmatory factor analysis. Research in Nursing & Health, 14, 387-391.
[4] Sawilowsky, S. (2002). A quick distribution-free test for trend that contributes evidence of construct validity. Measurement and Evaluation in Counseling and Development, 35, 78-88.
[5] Cuzzocrea, J., & Sawilowsky, S. (2009). Robustness to non-independence and power of the I test for trend in construct validity. Journal of Modern Applied Statistical Methods, 8(1), 215-225.
[6] Saris, W. E., & Andrews, F. M. (1991). Evaluation of measurement instruments using a structural modeling approach. Pp. 575-599 in Measurement errors in surveys, edited by Biemer, P. P. et al. New York: Wiley.
