Repeated measures design

The repeated measures design (also known as a within-subjects design) uses the same subjects with every condition of the research, including the control.[1] For instance, repeated measures are collected in a longitudinal study in which change over time is assessed. Other studies compare the same measure under two or more different conditions. For instance, to test the effects of caffeine on cognitive function, a subject's math ability might be tested once after they consume caffeine and another time when they consume a placebo.

Crossover studies, an example of a repeated measures design

A popular repeated-measures design is the crossover study. A crossover study is a longitudinal study in which subjects receive a sequence of different treatments (or exposures). While crossover studies can be observational studies, many important crossover studies are controlled experiments, which are discussed in this article. Crossover designs are common for experiments in many scientific disciplines, for example psychology, education, pharmaceutical science, and health care, especially medicine.

Randomized, controlled, crossover experiments are especially important in health care. In a randomized clinical trial, the subjects are randomly assigned treatments. When the randomized clinical trial is a repeated measures design, the subjects are randomly assigned to a sequence of treatments. A crossover clinical trial is a repeated-measures design in which each patient is randomly assigned to a sequence of at least two treatments (of which one "treatment" may be a standard treatment or a placebo). Thus, each patient crosses over from one treatment to another.

Nearly all crossover designs have "balance", which means that all subjects should receive the same number of treatments and that all subjects participate for the same number of periods. In most crossover trials, each subject receives all treatments.
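
The random assignment of subjects to treatment sequences can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assign_sequences, the subject labels, and the treatment labels "A" and "B" are hypothetical, and using every possible sequence equally often is one simple way to obtain a balanced allocation, not a rule stated by the sources above.

```python
import random
from itertools import permutations

def assign_sequences(subject_ids, treatments, seed=0):
    """Randomly assign each subject to one treatment sequence.

    Assumption for this sketch: every ordering of the treatments is used
    equally often, so each subject receives all treatments over the same
    number of periods.
    """
    sequences = list(permutations(treatments))  # e.g. (A, B) and (B, A)
    if len(subject_ids) % len(sequences) != 0:
        raise ValueError("number of subjects must be a multiple of the number of sequences")
    # Build a pool with each sequence repeated equally often, then shuffle it.
    pool = sequences * (len(subject_ids) // len(sequences))
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    rng.shuffle(pool)
    return dict(zip(subject_ids, pool))

allocation = assign_sequences(["s1", "s2", "s3", "s4"], ["A", "B"])
for subject, sequence in allocation.items():
    print(subject, "->", " then ".join(sequence))
```

With two treatments this yields the familiar two-sequence (AB/BA) layout, with half of the subjects in each sequence.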

However, many repeated-measures designs are not crossover studies: for example, the longitudinal study of the sequential effects of repeated treatments need not use any "crossover" (Vonesh & Chinchilli; Jones & Kenward).

Uses of a repeated measures design

  • Conduct an experiment when few participants are available: The repeated measures design reduces the variance of estimates of treatment effects, allowing statistical inference to be made with fewer subjects.
  • Conduct an experiment more efficiently: Repeated measures designs allow many experiments to be completed more quickly, as only a few groups need to be trained to complete an entire experiment. For example, there are many experiments where each condition takes only a few minutes, whereas the training to complete the tasks takes as much time, if not more.
  • Study changes in participants’ behavior over time: Repeated measures designs allow researchers to monitor how the participants change over the passage of time, both in the case of long-term situations like longitudinal studies and in the much shorter-term case of practice effects.

Practice effects

Practice effects occur when a participant in an experiment is able to perform a task and then perform it again at some later time. Generally, they either have a positive (subjects become better at performing the task) or negative (subjects become worse at performing the task) effect. Repeated measures designs are almost always affected by practice effects; the primary exception to this rule is in the case of a longitudinal study. How well these are measured is controlled by the exact type of repeated measure design that is used.

Both types, however, have the goal of controlling for practice effects.

Advantages and disadvantages

Advantages

The primary strengths of the repeated measures design are that it makes an experiment more efficient and helps keep the variability low. This helps to keep the validity of the results higher, while still allowing for smaller than usual subject groups.[2]

Disadvantages

A disadvantage of the repeated measures design is that it may not be possible for each participant to be in all conditions of the experiment (e.g. because of time constraints or the location of the experiment). Severely diseased subjects in particular tend to drop out of a longitudinal study, and removing these subjects from the analysis would bias the results. In these cases mixed effects models would be preferable, as they can deal with missing values.


There are also several threats to the internal validity of this design, namely a regression threat (when subjects are tested several times, their scores tend to regress towards the mean), a maturation threat (subjects may change during the course of the experiment) and a history threat (events outside the experiment that may change the response of subjects between the repeated measures).

Repeated Measures ANOVA

Repeated measures analysis of variance (rANOVA) is one of the most commonly used statistical approaches to repeated measures designs.[3] With such designs, the repeated-measure factor (the qualitative independent variable) is referred to as the within-subjects factor, while the dependent quantitative variable on which each participant is measured is referred to as the dependent variable.

Partitioning out Error

One of the greatest advantages of the rANOVA, as is the case with repeated measures designs in general, is that it is possible to partition out variability due to individual differences. Consider the general structure of the F-statistic:

F = MS_Treatment / MS_Error = (SS_Treatment / df_Treatment) / (SS_Error / df_Error)

In a between-subjects design there is an element of variance due to individual difference that is combined in with the treatment and error terms:

SS_Total = SS_Treatment + SS_Error

df_Total = N - 1, where N is the total number of observations

In a repeated measures design it is possible to account for these differences, and partition them out from the treatment and error terms. In such a case, the variability can be broken down into between-treatments variability (or within-subjects effects, excluding individual differences) and within-treatments variability. The within-treatments variability can be further partitioned into between-subjects variability (individual differences) and error (excluding the individual differences).[4]

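SS_Total = SS_Treatment (excluding individual differences) + SS_Subjects + SS_Error

df_Total = df_Treatment (within subjects) + df_Subjects (between subjects) + df_Error = (k - 1) + (n - 1) + (k - 1)(n - 1) = nk - 1, where n is the number of subjects and k is the number of levels of the within-subjects factor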

In reference to the general structure of the F-statistic, it is clear that partitioning out the between-subjects variability will tend to increase the F-value, because the error sum of squares becomes smaller, resulting in a smaller MS_Error. It is noteworthy that partitioning variability also removes degrees of freedom from the error term, so the between-subjects variability must be large enough to offset the loss in degrees of freedom. If between-subjects variability is small, this process may actually reduce the F-value.[4]
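
The partition described above can be worked through on a small data set. The following Python sketch is an illustration only: the scores are made up, and the variable names are hypothetical. It computes the three sums of squares for a one-way repeated measures layout (rows are subjects, columns are conditions) and compares the resulting F-value with the one obtained when the same scores are treated as a between-subjects design.

```python
# Rows are subjects, columns are conditions (made-up scores for illustration).
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
    [51, 55, 59],
    [44, 49, 56],
]
n = len(scores)      # number of subjects
k = len(scores[0])   # number of levels of the within-subjects factor

grand_mean = sum(sum(row) for row in scores) / (n * k)
cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
subj_means = [sum(row) / k for row in scores]

ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_treatment = n * sum((m - grand_mean) ** 2 for m in cond_means)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subj_means)
# Error: what is left after removing condition and subject effects.
ss_error = sum(
    (scores[i][j] - cond_means[j] - subj_means[i] + grand_mean) ** 2
    for i in range(n) for j in range(k)
)

df_treatment = k - 1
df_subjects = n - 1
df_error = (k - 1) * (n - 1)

f_repeated = (ss_treatment / df_treatment) / (ss_error / df_error)

# Same scores analysed as between-subjects: individual differences
# remain inside the error term, which has n*k - k degrees of freedom.
ss_error_between = ss_total - ss_treatment
df_error_between = n * k - k
f_between = (ss_treatment / df_treatment) / (ss_error_between / df_error_between)

print("SS total, treatment, subjects, error:",
      round(ss_total, 2), round(ss_treatment, 2),
      round(ss_subjects, 2), round(ss_error, 2))
print("F (repeated measures):", round(f_repeated, 2))
print("F (between subjects): ", round(f_between, 2))
```

In this made-up data set most of the variability lies between subjects, so removing it from the error term raises the F-value even though the error degrees of freedom drop from n*k - k to (k - 1)(n - 1).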

Assumptions

As with all statistical analyses, there are a number of assumptions that should be met to justify the use of this test. Violations of these assumptions can moderately to severely affect results, and often lead to an inflation of Type I error. With the rANOVA, there are both standard univariate assumptions to be met, as well as multivariate assumptions.[5] The univariate assumptions are as follows:

1. Normality: For each level of the within-subjects factor, the dependent variable must have a normal distribution.

2. Sphericity: Difference scores computed between two levels of a within-subjects factor must have the same variance for the comparison of any two levels. (This assumption only applies if there are more than 2 levels of the independent variable.)

3. Randomness: Cases should be derived from a random sample, and the scores between participants should be independent from each other.

The rANOVA also requires that certain multivariate assumptions are met because a multivariate test is conducted on difference scores. These assumptions include:

1. Multivariate normality: The difference scores are multivariately normally distributed in the population.

2. Randomness: Individual cases should be derived from a random sample, and the difference scores for each participant are independent from those of another participant.
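
The sphericity assumption listed above can be inspected informally by computing the variance of the difference scores for every pair of levels. The Python sketch below does only that, on made-up scores; the function name difference_variances is hypothetical, and this descriptive check is not a formal significance test of sphericity.

```python
from itertools import combinations
from statistics import variance

# Rows are subjects, columns are levels of the within-subjects factor
# (made-up scores for illustration).
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
    [51, 55, 59],
    [44, 49, 56],
]

def difference_variances(scores):
    """Return the sample variance of the difference scores for each pair of levels."""
    k = len(scores[0])
    result = {}
    for a, b in combinations(range(k), 2):
        diffs = [row[a] - row[b] for row in scores]
        result[(a, b)] = variance(diffs)  # sample variance, n - 1 denominator
    return result

for pair, var in difference_variances(scores).items():
    print("levels", pair, "variance of differences:", round(var, 2))
```

Under sphericity the printed variances would be roughly equal in the population; large discrepancies between pairs suggest that the assumption may not hold.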

F test

As with other analysis of variance tests, the rANOVA makes use of an F statistic to determine significance. Depending on the number of within-subjects factors and assumption violations, it is necessary to select the most appropriate of three tests:[5]

1. Standard Univariate ANOVA F test: This test is commonly used when there are only two levels of the within-subjects factor (e.g. time point 1 and time point 2). This test is not recommended for use when there are more than 2 levels of the within-subjects factor because the assumption of sphericity is commonly violated in such cases.

2. Alternative Univariate tests:[6] These tests account for violations of the assumption of sphericity, and can be used when the within-subjects factor has more than 2 levels. The F statistic is the same as in the Standard Univariate ANOVA F test, but is associated with a more accurate p-value. The correction is made by adjusting the degrees of freedom downward when determining the critical F value. Two corrections are commonly used: the Greenhouse-Geisser correction and the Huynh-Feldt correction. The Greenhouse-Geisser correction is more conservative, but addresses a common issue of increasing variability over time in a repeated-measures design.[7] The Huynh-Feldt correction is less conservative, but does not address issues of increasing variability. It has been suggested that the Huynh-Feldt correction be used with smaller departures from sphericity, while the Greenhouse-Geisser correction be used when the departures are very large.

3. Multivariate Test: This test does not assume sphericity, but is also highly conservative.
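
The downward adjustment of the degrees of freedom used by the alternative univariate tests can be illustrated with the Greenhouse-Geisser estimate of epsilon. The formula is not given in this article; the sketch below assumes the commonly used form, in which epsilon is computed from the double-centred sample covariance matrix of the k levels, and both degrees of freedom of the F test are then multiplied by it. The scores and function names are hypothetical.

```python
def covariance_matrix(scores):
    """Sample covariance matrix of the columns (levels) of scores."""
    n, k = len(scores), len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in scores) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]

def greenhouse_geisser_epsilon(scores):
    """Epsilon estimate: trace(C)^2 / ((k - 1) * sum of squared entries of C),
    where C is the double-centred covariance matrix (assumed formula)."""
    k = len(scores[0])
    s = covariance_matrix(scores)
    row_means = [sum(r) / k for r in s]   # equal to column means, s is symmetric
    grand = sum(row_means) / k
    c = [[s[i][j] - row_means[i] - row_means[j] + grand for j in range(k)]
         for i in range(k)]
    trace = sum(c[i][i] for i in range(k))
    sum_sq = sum(c[i][j] ** 2 for i in range(k) for j in range(k))
    return trace ** 2 / ((k - 1) * sum_sq)

# Rows are subjects, columns are levels (made-up scores for illustration).
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
    [51, 55, 59],
    [44, 49, 56],
]
n, k = len(scores), len(scores[0])
eps = greenhouse_geisser_epsilon(scores)
df_treatment_adj = eps * (k - 1)            # adjusted numerator df
df_error_adj = eps * (k - 1) * (n - 1)      # adjusted denominator df
print("epsilon:", round(eps, 3))
print("adjusted df:", round(df_treatment_adj, 2), round(df_error_adj, 2))
```

Epsilon lies between 1/(k - 1) and 1; a value of 1 leaves the degrees of freedom unchanged, and with only two levels the estimate is always 1, which is consistent with sphericity only being a concern for more than two levels.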

Effect Size

One of the most commonly reported effect size statistics for rANOVA is partial eta-squared (η_p²). It is also common to use the multivariate η² when the assumption of sphericity has been violated and the multivariate test statistic is reported. A third effect size statistic that is reported is the generalized η², which is comparable to η_p² in a one-way repeated measures ANOVA and has been shown to be a better estimate of effect size with other within-subjects tests.[8][9]
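
As a rough illustration, both univariate effect sizes can be computed from the sums of squares of a one-way repeated measures ANOVA. The formulas below are assumptions supplied for this sketch rather than taken from this article: partial eta-squared is taken as SS_Treatment / (SS_Treatment + SS_Error), and the generalized eta-squared for a single within-subjects factor is taken as SS_Treatment / (SS_Treatment + SS_Subjects + SS_Error). The numeric inputs are hypothetical.

```python
def partial_eta_squared(ss_treatment, ss_error):
    """Treatment variability relative to treatment plus error variability."""
    return ss_treatment / (ss_treatment + ss_error)

def generalized_eta_squared_one_way(ss_treatment, ss_subjects, ss_error):
    """Assumed one-way within-subjects form: subject variability
    is kept in the denominator."""
    return ss_treatment / (ss_treatment + ss_subjects + ss_error)

# Hypothetical sums of squares from a one-way repeated measures ANOVA.
ss_treatment, ss_subjects, ss_error = 143.44, 658.28, 57.22

eta_p2 = partial_eta_squared(ss_treatment, ss_error)
eta_g2 = generalized_eta_squared_one_way(ss_treatment, ss_subjects, ss_error)
print("partial eta-squared:    ", round(eta_p2, 3))
print("generalized eta-squared:", round(eta_g2, 3))
```

The two statistics differ only in whether the between-subjects sum of squares is included in the denominator.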

Cautions

While there are many advantages to the repeated-measures design, the repeated measures ANOVA is not always the best statistical analysis to conduct. The rANOVA is still highly vulnerable to effects from missing values, imputation, non-equivalent time points between subjects, and violations of sphericity.[10] These issues can result in sampling bias and inflated rates of Type I error.[11] In such cases it may be better to consider use of a linear mixed model.[12]

Notes

  1. http://www.experiment-resources.com/repeated-measures-design.html#ixzz1cl4ahmlq
  2. Minke, 1997
  3. Gueorguieva; Krystal (2004). "Move Over ANOVA". Arch Gen Psychiatry 61: 310–317.
  4. Howell, David C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Thomson Wadsworth. ISBN 978-0-495-59784-1.
  5. Green, Samuel B.; Salkind, Neil J. Using SPSS for Windows and Macintosh: analyzing and understanding data (6th ed.). Boston: Prentice Hall. ISBN 978-0-205-02040-9.
  6. Vasey; Thayer (1987). Psychophysiology 3.
  7. Park (1993). "A comparison of the generalized estimating equation approach with the maximum likelihood approach for repeated measurements". Stat Med 12: 1723–1732.
  8. Bakeman (2005). "Recommended effect size statistics for repeated measures designs". Behavior Research Methods 37 (3): 379–384.
  9. Olejnik; Algina (2003). "Generalized eta and omega squared statistics: Measures of effect size for some common research designs". Psychological Methods 8: 434–447.
  10. Gueorguieva; Krystal (2004). "Move Over ANOVA". Arch Gen Psychiatry 61: 310–317.
  11. Muller; Barton (1989). "Approximate Power for Repeated-Measures ANOVA Lacking Sphericity". Journal of the American Statistical Association 84 (406): 549–555.
  12. Krueger; Tian (2004). "A comparison of the general linear mixed model and repeated measures ANOVA using a dataset with multiple missing data points". Biological Research for Nursing 6: 151–157.

References

Design and analysis of experiments

  • Jones, Byron; Kenward, Michael G. (2003). Design and Analysis of Cross-Over Trials (Second ed.). London: Chapman and Hall. 
  • Vonesh, Edward F. and Chinchilli, Vernon G. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. London: Chapman and Hall. 

Exploration of longitudinal data

  • Davidian, Marie; Giltinan, David M. (1995). Nonlinear Models for Repeated Measurement Data. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. ISBN 0412983419.
  • Fitzmaurice, Garrett; Davidian, Marie; Verbeke, Geert; Molenberghs, Geert, eds. (2008). Longitudinal Data Analysis. Boca Raton, FL: Chapman and Hall/CRC. ISBN 1-58488-658-7.
  • Jones, Byron; Kenward, Michael G. (2003). Design and Analysis of Cross-Over Trials (Second ed.). London: Chapman and Hall.
  • Kim, Kevin and Timm, Neil (2007). "Restricted MGLM and growth curve model" (Chapter 7). Univariate and multivariate general linear models: Theory and applications with SAS (with 1 CD-ROM for Windows and UNIX). Statistics: Textbooks and Monographs (Second ed.). Boca Raton, FL: Chapman & Hall/CRC. ISBN 1-58488-634-X.
  • Kollo, Tõnu and von Rosen, Dietrich (2005). "Multivariate linear models" (Chapter 4), especially "The Growth curve model and extensions" (Chapter 4.1). Advanced multivariate statistics with matrices. Mathematics and its applications 579. Dordrecht: Springer. ISBN 1-4020-3418-0.
  • Kshirsagar, Anant M. and Smith, William Boyce (1995). Growth curves. Statistics: Textbooks and Monographs 145. New York: Marcel Dekker, Inc. ISBN 0-8247-9341-2.
  • Pan, Jian-Xin and Fang, Kai-Tai (2002). Growth curve models and statistical diagnostics. Springer Series in Statistics. New York: Springer-Verlag. ISBN 0-387-95053-2.
  • Seber, G. A. F. and Wild, C. J. (1989). "Growth models" (Chapter 7). Nonlinear regression. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. New York: John Wiley & Sons, Inc. pp. 325–367. ISBN 0-471-61760-1.
  • Timm, Neil H. (2002). "The general MANOVA model (GMANOVA)" (Chapter 3.6.d). Applied multivariate analysis. Springer Texts in Statistics. New York: Springer-Verlag. ISBN 0-387-95347-7.
  • Vonesh, Edward F. and Chinchilli, Vernon G. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. London: Chapman and Hall.  (Comprehensive treatment of theory and practice)
  • Conaway, M. (1999, October 11). Repeated Measures Design. Retrieved February 18, 2008, from http://biostat.mc.vanderbilt.edu/twiki/pub/Main/ClinStat/repmeas.PDF
  • Minke, A. (1997, January). Conducting Repeated Measures Analyses: Experimental Design Considerations. Retrieved February 18, 2008, from Ericae.net: http://ericae.net/ft/tamu/Rm.htm
  • Shaughnessy, J. J. (2006). Research Methods in Psychology. New York: McGraw-Hill.
