Multiple baseline design
A multiple baseline design is a style of research involving the careful measurement of multiple persons, traits or settings both before and after a treatment. The design is used in medical, psychological and biological research, among other areas. It has several advantages over AB designs, which measure only a single case. Importantly, the start of treatment is staggered (begun at different times) across individuals. Because treatment begins at different times, changes that coincide with each start are more plausibly due to the treatment than to a chance factor. By gathering data from many subjects (instances), inferences can be made about the likelihood that the measured trait generalizes to a larger population. In a multiple baseline design, the experimenter starts by measuring a trait of interest, then applies a treatment, then measures the trait again. Treatment should not begin until a stable baseline has been recorded, and should not finish until the measures regain stability. If a significant change occurs across all participants, the experimenter may infer that the treatment is effective.
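The two requirements described above, a stable baseline and staggered treatment starts, can be illustrated with a short Python sketch. The stability rule used here (the last few points all lying within a fixed fraction of their own mean) is an assumption chosen for illustration, as are the function names, parameters and example values; the design itself does not prescribe a particular criterion.

```python
from statistics import mean


def is_stable(series, window=3, tolerance=0.2):
    """Hypothetical stability rule: the last `window` points must all lie
    within `tolerance` (as a fraction) of their own mean."""
    if len(series) < window:
        return False
    recent = series[-window:]
    centre = mean(recent)
    if centre == 0:
        return all(value == 0 for value in recent)
    return all(abs(value - centre) <= tolerance * abs(centre) for value in recent)


def staggered_baseline_lengths(baselines, min_baseline=3, gap=2):
    """Choose a baseline length (in sessions) for each case.

    `baselines` maps a case name to its recorded baseline measurements.
    A case gets the shortest length at which its baseline is stable, and
    each length is at least `gap` sessions longer than the previous one,
    so that treatment starts are staggered. A case whose baseline never
    stabilises in the recorded data is mapped to None.
    """
    lengths = {}
    previous = None
    for case, series in baselines.items():
        first = min_baseline if previous is None else max(min_baseline, previous + gap)
        chosen = None
        for n in range(first, len(series) + 1):
            if is_stable(series[:n]):
                chosen = n
                break
        lengths[case] = chosen
        if chosen is not None:
            previous = chosen
    return lengths


# Illustrative (made-up) baseline measurements for three cases.
example = {
    "case A": [10, 11, 10, 10, 11, 10, 10, 11],
    "case B": [4, 9, 5, 5, 5, 5, 5, 5],
    "case C": [7, 7, 8, 7, 7, 7, 8, 7],
}
print(staggered_baseline_lengths(example))
# {'case A': 3, 'case B': 5, 'case C': 7}
```

In this example treatment would begin after session 3 for the first case, after session 5 for the second and after session 7 for the third, so a change that follows each start in turn is harder to attribute to a single outside event.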
Multiple baseline experiments are most commonly used in cases where the dependent variable is not expected to return to normal after the treatment has been applied, or when medical reasons forbid the withdrawal of a treatment. They often employ particular methods of recruiting participants. Multiple baseline designs are associated with potential confounds introduced by experimenter bias, which must be addressed in order to preserve objectivity. In particular, researchers are advised to develop all test schedules and data collection limits beforehand.
Although multiple baseline designs may employ any method of recruitment, they are often associated with "ex post facto" recruitment. This is because multiple baselines can provide data regarding the consensus of a treatment response across cases. Such data often cannot be gathered from ABA (reversal) designs for ethical or learning reasons. Experimenters are advised not to remove cases that do not exactly fit their criteria, as this may introduce sampling bias and threaten validity. Studies that use ex post facto recruitment are not considered true experiments, because the experimenter has limited experimental or randomized control over the trait: a control group may have to be selected from a separate, distinct population. This research design is thus considered a quasi-experimental design.
Multiple baseline studies are often categorized as either concurrent or nonconcurrent. Concurrent designs are the traditional approach, in which all participants take part over the same period, so that their baselines are measured at the same time. This strategy is advantageous because it moderates several threats to validity, history effects in particular. Concurrent designs also save time, since all participants are processed at once. The ability to obtain complete data sets within well-defined time constraints is a valuable asset when planning research.
Nonconcurrent multiple baseline studies apply treatment to several individuals at delayed intervals, so participants need not be studied over the same period. This has the advantage of greater flexibility in the recruitment of participants and the choice of testing location. Perhaps for this reason, nonconcurrent multiple baseline experiments have been recommended for research in educational settings. The experimenter is advised to select time frames beforehand to avoid experimenter bias, but even when such methods are used to improve validity, inferences may be weakened. There is ongoing debate as to whether nonconcurrent studies are genuinely threatened by history effects. It is generally agreed, however, that concurrent testing is more stable.
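One way to select time frames beforehand, sketched here in Python under stated assumptions, is to fix the set of baseline lengths before recruitment and then assign them to participants at random. Random assignment is an assumption of this sketch rather than something the text requires, and the function name, participant labels and baseline lengths are hypothetical.

```python
import random


def assign_baseline_lengths(participants, planned_lengths, seed=None):
    """Randomly pair each participant with one pre-planned baseline length.

    `planned_lengths` is decided before any data are collected, so the
    experimenter cannot tailor a baseline to how a participant is
    responding. A fixed `seed` makes the assignment reproducible.
    """
    if len(participants) > len(planned_lengths):
        raise ValueError("not enough planned baseline lengths for all participants")
    rng = random.Random(seed)
    shuffled = list(planned_lengths)
    rng.shuffle(shuffled)
    return dict(zip(participants, shuffled))


# Hypothetical plan: baselines of 5, 10 and 15 sessions, fixed in advance.
schedule = assign_baseline_lengths(["P1", "P2", "P3"], [5, 10, 15], seed=1)
print(schedule)
```

Because the lengths are fixed before recruitment, each participant can start whenever they enrol while still contributing a baseline of a different, predetermined duration.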
Although multiple baseline designs compensate for many of the issues inherent in ex post facto recruitment, a trait gathered by this method cannot be experimentally manipulated. Such studies are therefore prevented from inferring causation.
Managing threats to validity
A priori (beforehand) specification of the hypothesis, time frames, and data limits helps control threats due to experimenter bias. For the same reason, researchers should avoid removing participants based on merit. Multiple probe designs may be useful in identifying extraneous factors that may be influencing the results. Lastly, experimenters should avoid gathering data only during sessions. If in-session data are gathered, the date of each measurement should be recorded in order to provide an accurate timeline for potential reviewers. Such data may represent unnatural behaviour or states of mind, and must be considered carefully during interpretation.
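As a minimal Python sketch of the record-keeping suggested above, each measurement can carry its date and a flag showing whether it was taken in session, so that a reviewer can rebuild the timeline. The class, field and function names are hypothetical, and the values are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Measurement:
    participant: str
    taken_on: date      # date the measurement was recorded
    value: float
    in_session: bool    # True if gathered during a treatment session


def timeline(measurements, participant):
    """Return one participant's measurements in date order."""
    own = (m for m in measurements if m.participant == participant)
    return sorted(own, key=lambda m: m.taken_on)


# Illustrative records only.
records = [
    Measurement("P1", date(2024, 3, 8), 12.0, True),
    Measurement("P1", date(2024, 3, 1), 9.0, False),
    Measurement("P2", date(2024, 3, 2), 7.0, False),
    Measurement("P1", date(2024, 3, 5), 10.0, True),
]
for m in timeline(records, "P1"):
    print(m.taken_on.isoformat(), m.value, "in session" if m.in_session else "outside session")
```

Keeping the in-session flag alongside the date lets in-session measurements be examined separately during interpretation.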