Mixed-design analysis of variance

From Wikipedia, the free encyclopedia

In statistics, a mixed-design analysis of variance model, also known as a split-plot ANOVA, is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures. In a mixed-design ANOVA model, one factor (a fixed-effects factor) is a between-subjects variable and the other (a random-effects factor) is a within-subjects variable; overall, the model is therefore a type of mixed-effects model.

A repeated measures design is used when multiple independent variables or measures exist in a data set, but all participants have been measured on each variable.[1]: 506 

An example

Andy Field (2009)[1] provided an example of a mixed-design ANOVA in which he wants to investigate whether personality or attractiveness is the most important quality for individuals seeking a partner. In his example, there is a speed dating event set up in which there are two sets of what he terms "stooge dates": a set of males and a set of females. The experimenter selects 18 individuals, 9 males and 9 females, to play the stooge dates. Stooge dates are individuals chosen by the experimenter who vary in attractiveness and personality. For both males and females, there are three highly attractive individuals, three moderately attractive individuals, and three highly unattractive individuals. Of each set of three, one individual has a highly charismatic personality, one is moderately charismatic, and the third is extremely dull.

The participants are the individuals who sign up for the speed dating event and interact with each of the 9 individuals of the opposite sex. There are 10 male and 10 female participants. After each date, they rate on a scale of 0 to 100 how much they would like to have a date with that person, with zero indicating "not at all" and 100 indicating "very much".

The random factors, or so-called repeated measures, are looks, which has three levels (very attractive, moderately attractive, and highly unattractive), and personality, which again has three levels (highly charismatic, moderately charismatic, and extremely dull). Looks and personality have an overall random character because the precise level of each cannot be controlled by the experimenter (and indeed may be difficult to quantify[2]); the 'blocking' into discrete categories is for convenience and does not guarantee precisely the same level of looks or personality within a given block;[3] and the experimenter is interested in making inferences about the general population of daters, not just the 18 'stooges'.[4] The fixed-effect factor, or so-called between-subjects measure, is gender, because the participants making the ratings were either female or male, and precisely these statuses were designed by the experimenter.

ANOVA assumptions

When running an analysis of variance to analyse a data set, the data set should meet the following criteria:

  1. Normality: scores for each condition should be sampled from a normally distributed population.
  2. Homogeneity of variance: each population should have the same error variance.
  3. Sphericity of the covariance matrix: ensures that the F ratios follow the F distribution.

For the between-subject effects to meet the assumptions of the analysis of variance, the variance for any level of a group must be the same as the variance for the mean of all other levels of the group. When there is homogeneity of variance, sphericity of the covariance matrix will also hold, because independence has been maintained between subjects.[5][page needed]

For the within-subject effects, it is important to ensure normality and homogeneity of variance are not being violated.[5][page needed]

If the assumptions are violated, a possible solution is to use the Greenhouse–Geisser correction[6] or the Huynh–Feldt adjustment[7] to the degrees of freedom, because these can correct for issues that arise when the sphericity assumption on the covariance matrix is violated.[5][page needed]
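Both corrections scale the degrees of freedom by an estimate of Box's epsilon, which equals 1 under perfect sphericity and falls toward its lower bound 1/(C − 1) as the violation worsens. A minimal numpy sketch of the Greenhouse–Geisser estimate, computed from the covariance matrix of the within-subject measures (the function name is illustrative, not a standard API):

```python
import numpy as np

def greenhouse_geisser_epsilon(S):
    """Box's epsilon-hat for a k x k covariance matrix S of the
    within-subject measures: 1 under sphericity, >= 1/(k - 1)."""
    S = np.asarray(S, dtype=float)
    k = S.shape[0]
    # Double-center the covariance matrix (subtract row means,
    # column means, add back the grand mean).
    row = S.mean(axis=1, keepdims=True)
    col = S.mean(axis=0, keepdims=True)
    D = S - row - col + S.mean()
    return np.trace(D) ** 2 / ((k - 1) * np.sum(D ** 2))

# A compound-symmetric covariance matrix satisfies sphericity,
# so the estimate equals 1.
S = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
print(round(greenhouse_geisser_epsilon(S), 6))  # 1.0
```

The corrected test multiplies both the numerator and denominator degrees of freedom of the within-subject F test by this epsilon before looking up the critical value.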

Partitioning the sums of squares and the logic of ANOVA

Because the mixed-design ANOVA uses both between-subject variables and within-subject variables (i.e. repeated measures), it is necessary to partition out (or separate) the between-subject effects and the within-subject effects.[5] It is as if two separate ANOVAs were run on the same data set, except that in a mixed design the interaction of the two effects can also be examined. As can be seen in the source table provided below, the between-subject variables can be partitioned into the main effect of the first factor and into the error term. The within-subjects terms can be partitioned into three terms: the second (within-subjects) factor, the interaction term for the first and second factors, and the error term.[5][page needed] The main difference between the sums of squares of the within-subject factors and the between-subject factors is that the within-subject factors have an interaction factor.

More specifically, the total sum of squares in a regular one-way ANOVA consists of two parts: variance due to treatment or condition (SSbetween-subjects) and variance due to error (SSwithin-subjects). Normally the SSwithin-subjects is a measure of error variance. In a mixed design, however, repeated measures are taken from the same participants, and therefore this sum of squares can be broken down further into three components: SSwithin-subjects (variance due to being in different repeated-measures conditions), SSerror (other variance), and SSBS×WS (variance due to the interaction of the between-subjects and within-subjects conditions).[5]
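The partition above can be checked numerically: the component sums of squares must add back up to the total. A minimal numpy sketch on simulated data (the layout — R groups, s subjects per group, C conditions — is a hypothetical balanced design, not Field's example):

```python
import numpy as np

rng = np.random.default_rng(0)
R, s, C = 2, 5, 3                # groups, subjects per group, conditions
x = rng.normal(size=(R, s, C))   # x[g, i, c]: score of subject i, group g, condition c

grand = x.mean()
ss_total = ((x - grand) ** 2).sum()

# Between-subjects partition: subject means collapse the repeated measures.
subj = x.mean(axis=2)                                    # (R, s) subject means
ss_bs_total = C * ((subj - grand) ** 2).sum()
ss_groups = s * C * ((x.mean(axis=(1, 2)) - grand) ** 2).sum()
ss_bs_error = ss_bs_total - ss_groups                    # subjects within groups

# Within-subjects partition.
ss_ws_total = ss_total - ss_bs_total
ss_cond = R * s * ((x.mean(axis=(0, 1)) - grand) ** 2).sum()
cells = x.mean(axis=1)                                   # (R, C) cell means
ss_cells = s * ((cells - grand) ** 2).sum()
ss_interaction = ss_cells - ss_groups - ss_cond          # BS x WS interaction
ss_ws_error = ss_ws_total - ss_cond - ss_interaction

# The five components reassemble the total sum of squares.
parts = ss_groups + ss_bs_error + ss_cond + ss_interaction + ss_ws_error
print(abs(parts - ss_total) < 1e-9)  # True
```

Dividing each component by its degrees of freedom (given below) yields the mean squares from which the F ratios are formed.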

Each effect has its own F value. Both the between-subject and within-subject factors have their own MSerror term, which is used to calculate separate F values:

  • FBetween-subjects = MSbetween-subjects/MSError(between-subjects)
  • FWithin-subjects = MSwithin-subjects/MSError(within-subjects)
  • FBS×WS = MSbetween×within/MSError(within-subjects)

Analysis of variance table

Results are often presented in a table of the following form.[5][page needed]

Source                  SS            df                MS             F
Between subjects        SSBS          Nk – 1
  Group                 SSR           R – 1             MSR            FR
  Error (between)       SSBS(Error)   Nk – R            MSBS(Error)
Within subjects         SSWS          Nk(C – 1)
  Condition             SSC           C – 1             MSC            FC
  Group × Condition     SSR×C         (R – 1)(C – 1)    MSR×C          FR×C
  Error (within)        SSWS(Error)   (Nk – R)(C – 1)   MSWS(Error)
Total                   SST           NkC – 1

Degrees of freedom

To calculate the degrees of freedom for the between-subjects effect, dfBS = R – 1, where R refers to the number of between-subject groups (levels).[5][page needed]

In the case of the degrees of freedom for the between-subject effects error, dfBS(Error) = Nk – R, where Nk is equal to the number of participants, and again R is the number of levels.

To calculate the degrees of freedom for within-subject effects, dfWS = C – 1, where C is the number of within-subject tests. For example, if participants completed a specific measure at three time points, C = 3, and dfWS = 2.

The degrees of freedom for the between-subjects by within-subjects interaction term, dfBS×WS = (R – 1)(C – 1), where again R refers to the number of between-subject groups and C is the number of within-subject tests.

Finally, the within-subject error is calculated by dfWS(Error) = (Nk – R)(C – 1), in which Nk is the number of participants, and R and C remain as defined above.
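The five formulas above are simple enough to collect in one helper (the function name and the example design are illustrative):

```python
def mixed_anova_dfs(Nk, R, C):
    """Degrees of freedom for a mixed-design ANOVA with Nk participants,
    R between-subject groups, and C within-subject conditions."""
    return {
        "between": R - 1,
        "between_error": Nk - R,
        "within": C - 1,
        "interaction": (R - 1) * (C - 1),
        "within_error": (Nk - R) * (C - 1),
    }

# Hypothetical design: 20 participants in 2 groups, measured at 3 time points.
print(mixed_anova_dfs(Nk=20, R=2, C=3))
# {'between': 1, 'between_error': 18, 'within': 2, 'interaction': 2, 'within_error': 36}
```

As a sanity check, the between-subjects dfs sum to Nk – 1 and the within-subjects dfs sum to Nk(C – 1), so together all five account for NkC – 1 total degrees of freedom.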

Follow-up tests

When there is a significant interaction between a between-subject factor and a within-subject factor, statisticians often recommend pooling the between-subject and within-subject MSerror terms.[5][page needed][citation needed] This can be calculated in the following way:

MSWCELL = (SSBSError + SSWSError) / (dfBSError + dfWSError)

This pooled error is used when testing the effect of the between-subject variable within a level of the within-subject variable. If testing the within-subject variable at different levels of the between-subject variable, the MSws/e error term that tested the interaction is the correct error term to use. More generally, as described by Howell (1987, Statistical Methods for Psychology, 2nd edition, p. 434), when testing simple effects based on the interaction, one should use the pooled error term when the factor being tested and the interaction were tested with different error terms. When the factor being tested and the interaction were tested with the same error term, that term is sufficient.
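The pooled term is simply the combined sums of squares over the combined degrees of freedom. A one-line sketch (the SS and df values below are made-up numbers for illustration):

```python
def pooled_ms_error(ss_bs_error, df_bs_error, ss_ws_error, df_ws_error):
    """MS_wcell: pooled error term, combined SS over combined df."""
    return (ss_bs_error + ss_ws_error) / (df_bs_error + df_ws_error)

# Hypothetical values: SSBSError = 90 on 18 df, SSWSError = 54 on 36 df.
print(round(pooled_ms_error(90.0, 18, 54.0, 36), 4))  # 2.6667
```

Note that the parentheses matter: both sums are formed before dividing, which is why the pooled mean square generally differs from the average of the two separate MSerror terms.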

When following up interactions for terms that are both between-subjects or both within-subjects variables, the method is identical to follow-up tests in ANOVA. The MSError term that applies to the follow-up in question is the appropriate one to use; e.g., if following up a significant interaction of two between-subject effects, use the MSError term from between-subjects.[5][page needed] See ANOVA.

References

  1. ^ a b Field, A. (2009). Discovering Statistics Using SPSS (3rd edition). Los Angeles: Sage.
  2. ^ Douglas C. Montgomery, Elizabeth A. Peck, and G. Geoffrey Vining; Introduction to Linear Regression Analysis; John Wiley & Sons, New York; 2001. Page 280.
  3. ^ Marianne Müller (ETH Zurich); Applied Analysis of Variance and Experimental Design, Lecture slides for week 4 (compiled 2011-10-25, delivered circa late 2013). Accessed 2019-01-23.
  4. ^ Gary W. Oehlert (University of Minnesota); A First Course in Design and Analysis of Experiments; self-published, USA; 2010. Page 289.
  5. ^ a b c d e f g h i j Howell, D. (2010). Statistical Methods for Psychology (7th edition). Australia: Wadsworth.
  6. ^ Geisser, S. and Greenhouse, S.W. (1958). An extension of Box's result on the use of the F distribution in multivariate analysis. Annals of Mathematical Statistics, 29, 885–891.
  7. ^ Huynh, H. and Feldt, L.S. (1970). Conditions under which mean square ratios in repeated measurements designs have exact F-distributions. Journal of the American Statistical Association, 65, 1582–1589.

Further reading

  • Cauraugh, J. H. (2002). "Experimental design and statistical decisions tutorial: Comments on longitudinal ideomotor apraxia recovery." Neuropsychological Rehabilitation, 12, 75–83.
  • Gueorguieva, R. & Krystal, J. H. (2004). "Progress in analyzing repeated-measures data and its reflection in papers published in the archives of general psychiatry." Archives of General Psychiatry, 61, 310–317.
  • Huck, S. W. & McLean, R. A. (1975). "Using a repeated measures ANOVA to analyze the data from a pretest-posttest design: A potentially confusing task". Psychological Bulletin, 82, 511–518.
  • Pollatsek, A. & Well, A. D. (1995). "On the use of counterbalanced designs in cognitive research: A suggestion for a better and more powerful analysis". Journal of Experimental Psychology, 21, 785–794.
