Stepped-wedge trial

A stepped-wedge trial (or SWT) is a type of randomised controlled trial (or RCT), a scientific experiment which is structured to reduce bias when testing new medical treatments, social interventions, or other testable hypotheses. In a traditional RCT, half of the participants in the experiment are simultaneously and randomly assigned to a group that receives the treatment (the "treatment group") and half to a group that does not (the "control group"). In an SWT, typically a logistical constraint prevents the simultaneous treatment of half of the participants, and instead, participants receive the treatment in "waves" or "clusters."

For instance, suppose a researcher wanted to measure whether teaching college students how to make several meals increased their propensity to cook at home instead of eating out. In a traditional RCT, a sample of students would be selected and half would be trained on how to cook these meals, whereas the other half would not. Both groups would be monitored to see how frequently they ate out. In the end, the number of times the treatment group ate out would be compared to the number of times the control group ate out, most likely with a t-test or some variant. If, however, the researcher could only train a limited number of students each week, then the researcher could employ an SWT, randomly assigning students to which week they would be trained. In an SWT, a classic t-test is inappropriate, and different statistical methods must be used.

The term "stepped wedge" was coined by the Gambia Hepatitis Intervention Study due to the stepped-wedge shape that is apparent from a schematic illustration of the design.[1] The crossover is in one direction, typically from control to intervention, with the intervention not removed once implemented. The stepped-wedge design can be used for individually randomised trials,[2][3] i.e., trials where each individual is treated sequentially, but is more commonly used as a cluster randomised trial (CRT),[4] as in our example of groups of students being assigned to different weeks to be trained.

Experiment Design and Data Collection

In the context of a cluster randomised trial, the stepped-wedge design involves the collection of observations during a baseline period in which no clusters are exposed to the intervention. Following this, at regular intervals, or steps, a cluster (or group of clusters) is randomised to receive the intervention[4][5] and all participants are once again measured.[6] This process continues until all clusters have received the intervention. Finally, one more measurement is made after all clusters have received the intervention.
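
The resulting allocation can be pictured as a matrix of clusters by measurement periods in which the block of treated cells forms the characteristic wedge. The following minimal sketch (in Python, with an arbitrary illustrative choice of four clusters and one cluster crossing over per step; the function name is made up for illustration) prints such a schedule:

```python
# Sketch of a stepped-wedge treatment schedule.
# Rows are clusters, columns are measurement periods (one baseline period plus
# one period per step); 1 means the cluster has crossed over to the intervention.
# Crossover is one-directional, so every row is a run of 0s followed by 1s.
import random

def stepped_wedge_schedule(n_clusters, clusters_per_step=1, seed=0):
    random.seed(seed)
    order = list(range(n_clusters))
    random.shuffle(order)                           # clusters are randomised to crossover times
    n_steps = -(-n_clusters // clusters_per_step)   # ceiling division
    n_periods = n_steps + 1                         # baseline + one period per step
    schedule = [[0] * n_periods for _ in range(n_clusters)]
    for position, cluster in enumerate(order):
        crossover = position // clusters_per_step + 1
        for period in range(crossover, n_periods):
            schedule[cluster][period] = 1
    return schedule

for row in stepped_wedge_schedule(4):
    print(row)
# Each printed row shows the periods at which that cluster is under intervention.
```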

When is SWT an appropriate design?

Hargreaves and colleagues offer a series of five questions that researchers should answer to decide whether an SWT is indeed the optimal design, and how to proceed at every step of the study.[7] Specifically, researchers should be able to identify:

  1. The reasons an SWT is the preferred design: If measuring a treatment effect is the primary goal of the research, an SWT may not be the optimal design. SWTs are appropriate when the research focus is on the effectiveness of the treatment rather than on its mere existence. If the study is explanatory (i.e. seeks to study the cause of an effect), the benefits are significant but so are the challenges. Repeated interventions and the corresponding training workload of canvassers over time, minimizing attrition, and ensuring compliance and ignorability can increase costs and undermine unbiasedness and efficiency. Moreover, dealing with ethical issues related to postponing the intervention for some clusters is also crucial. Overall, logistical and other practical concerns are considered the best reasons to turn to a stepped wedge design.
  2. Which SWT design is more suitable: SWTs can feature three main designs: a closed cohort, an open cohort, and continuous recruitment with short exposure.[8] Typically, in the closed cohort design all subjects participate from the beginning of the experiment until its completion, and the outcomes are measured repeatedly at fixed time points which may or may not be related to each step. In the open cohort design, only some of the subjects are exposed from the start, and more are gradually exposed in subsequent steps, so the time of exposure varies for each subject. The outcomes are measured similarly to the closed cohort design, but new subjects can enter the study and some participants from an early stage can leave before its completion. In continuous recruitment with short exposure, very few or no subjects participate at the beginning of the experiment, but more become eligible and are gradually exposed to a short intervention. In this design, each subject is assigned to either the treatment or the control condition, a feature that minimizes the risk of carry-over effects, which can be a challenge for closed and open cohort designs.
  3. Which analysis strategy is appropriate: Linear mixed models (LMM), generalized linear mixed models (GLMM), and generalized estimating equations (GEE) are the principal estimators recommended for analyzing the results. While LMM offers higher power than GLMM and GEE, it assumes a continuous, normally distributed response and can be inefficient if cluster sizes vary. If any of these assumptions is violated, GLMM or GEE is preferred (see the sketch after this list).
  4. How big the sample should be: Power analysis and sample size calculation methods are available. Generally, SWTs require a smaller sample size to detect effects, since they leverage both between- and within-cluster comparisons.[9][10]
  5. Best practices for reporting the design and results of the trial: Reporting the design, sample profile, and results can be challenging, since no Consolidated Standards of Reporting Trials (CONSORT) guidelines have been designated for SWTs, though some studies have provided formalizations and flow charts that help to sustain a balanced sample and report results.[11]
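
As an illustration of the third question, the sketch below fits a population-averaged model with generalized estimating equations (one of the estimators named above) to simulated stepped-wedge data, using statsmodels with an exchangeable working correlation. The column names, cluster counts, and effect sizes are illustrative assumptions rather than values taken from the cited studies.

```python
# Illustrative GEE analysis of stepped-wedge data with a binary outcome.
# Assumed columns: 'cluster' (group id), 'period' (categorical time effect),
# 'treated' (0/1 intervention indicator), 'outcome' (0/1 response).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for cluster in range(6):
    crossover = cluster // 2 + 1              # two clusters cross over at each step
    for period in range(4):                   # one baseline period plus three steps
        treated = int(period >= crossover)
        for _ in range(20):                   # 20 subjects per cluster-period
            p = 0.3 + 0.1 * treated           # illustrative intervention effect
            rows.append((cluster, period, treated, rng.binomial(1, p)))
df = pd.DataFrame(rows, columns=["cluster", "period", "treated", "outcome"])

# Time enters as a fixed categorical effect; within-cluster correlation is
# handled through the exchangeable working correlation structure.
gee = smf.gee(
    "outcome ~ C(period) + treated",
    groups="cluster",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
).fit()
print(gee.summary())
```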

Model

While there are several other potential methods for modeling outcomes in an SWT,[12] the work of Hussey and Hughes[6] "first described methods to determine statistical power available when using a stepped wedge design."[12] What follows is their design.

Suppose the study sample is divided into $I$ clusters. At each time point $j = 1, \ldots, T$, preferably equally spaced in actual time, some number of clusters are treated. Let $X_{ij}$ be $1$ if cluster $i$ has been treated at time $j$ and $0$ otherwise. In particular, note that if $X_{ij} = 1$ then $X_{i(j+1)} = 1$, since the intervention is not removed once implemented.

For each participant $k$ in cluster $i$, measure the outcome to be studied $Y_{ijk}$ at time $j$. We model these outcomes as

$$Y_{ijk} = \mu + \alpha_i + \beta_j + X_{ij}\theta + e_{ijk}$$

where:

  • $\mu$ is a grand mean,
  • $\alpha_i$ is a random, cluster-level effect on the outcome,
  • $\beta_j$ is a time point-specific fixed effect,
  • $\theta$ is the measured effect of the treatment, and
  • $e_{ijk}$ is the residual noise.

This model can be viewed as a hierarchical linear model: at the lowest level $Y_{ijk} = \mu_{ij} + e_{ijk}$, where $\mu_{ij}$ is the mean of a given cluster at a given time, and at the cluster level each cluster mean is $\mu_{ij} = \mu + \alpha_i + \beta_j + X_{ij}\theta$.
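
To make the notation concrete, the sketch below simulates data from this model and recovers θ with a linear mixed model (random intercept per cluster, fixed effects for time and treatment) using statsmodels; the numbers of clusters and subjects, the variance components, and the true θ are arbitrary illustrative assumptions.

```python
# Simulate outcomes from Y_ijk = mu + alpha_i + beta_j + X_ij*theta + e_ijk
# and estimate theta with a random-intercept linear mixed model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

n_clusters, n_times, n_subjects = 8, 9, 15   # I clusters, T time points, subjects per cluster-period
mu, theta = 10.0, 1.5                        # grand mean and true treatment effect (illustrative)
tau, sigma = 1.0, 2.0                        # cluster-level and residual standard deviations
beta = rng.normal(0.0, 0.5, size=n_times)    # time point-specific fixed effects
alpha = rng.normal(0.0, tau, size=n_clusters)  # random cluster-level effects

rows = []
for i in range(n_clusters):
    crossover = i + 1                        # one cluster crosses over at each step
    for j in range(n_times):
        x_ij = int(j >= crossover)           # treatment indicator, never switched off
        for k in range(n_subjects):
            y = mu + alpha[i] + beta[j] + x_ij * theta + rng.normal(0.0, sigma)
            rows.append((i, j, x_ij, y))
df = pd.DataFrame(rows, columns=["cluster", "time", "treated", "y"])

# Random intercept for each cluster; time and treatment as fixed effects.
fit = smf.mixedlm("y ~ C(time) + treated", data=df, groups="cluster").fit()
print(fit.params["treated"])                 # estimate of theta
```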

Design Effect and Sample Size Calculation

The design effect of the stepped wedge design is given by the formula:[9]

$$DE_{sw} = \frac{1 + \rho(ktn + bn - 1)}{1 + \rho\left(\tfrac{1}{2}ktn + bn - 1\right)} \cdot \frac{3(1 - \rho)}{2t\left(k - \tfrac{1}{k}\right)}$$

where:

  • ρ is the intra-cluster correlation (ICC),
  • n is the number of subjects within a cluster,
  • k is the number of steps,
  • t is the number of measurements after each step, and
  • b is the number of baseline measurements.

To calculate the required sample size, the following simple formula is applied:[9]

$$N_{sw} = DE_{sw} \times N_u$$

where:

  • $N_{sw}$ is the required sample size for the SWT, and
  • $N_u$ is the total unadjusted sample size that would be required for a traditional RCT.

Note that increasing k, t, or b decreases the required sample size for an SWT.

Further, the required number of clusters $c$ is given by:[9]

$$c = \frac{N_{sw}}{n(b + kt)}$$

To calculate how many clusters $c_s$ need to switch from the control to the treatment condition at each step, the following formula is available:[9]

$$c_s = \frac{c}{k}$$

If $c$ and $c_s$ are not integers, they need to be rounded up to the next larger integer and distributed as evenly as possible among the k steps.
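
Putting these formulas together, the sketch below computes the design effect, the stepped-wedge sample size, the number of clusters, and the number of clusters crossing over per step; it implements the expressions shown above, and the numbers in the example call are purely illustrative.

```python
# Stepped-wedge design effect and sample size, following the formulas above.
import math

def design_effect(rho, n, k, t, b):
    """Design effect DE_sw: rho = ICC, n = subjects per cluster,
    k = steps, t = measurements after each step, b = baseline measurements."""
    correction = (1 + rho * (k * t * n + b * n - 1)) / \
                 (1 + rho * (0.5 * k * t * n + b * n - 1))
    return correction * 3 * (1 - rho) / (2 * t * (k - 1 / k))

def stepped_wedge_sample_size(n_unadjusted, rho, n, k, t, b):
    """Return (N_sw, number of clusters c, clusters crossing over per step c_s)."""
    n_sw = design_effect(rho, n, k, t, b) * n_unadjusted
    c = math.ceil(n_sw / (n * (b + k * t)))   # round up to the next larger integer
    c_s = math.ceil(c / k)
    return n_sw, c, c_s

# Illustrative call: unadjusted RCT sample size 400, ICC 0.05, 10 subjects per
# cluster per measurement, 4 steps, 1 measurement per step, 1 baseline measurement.
print(stepped_wedge_sample_size(400, rho=0.05, n=10, k=4, t=1, b=1))
# Increasing k, t, or b in this call illustrates the note above: the required
# sample size N_sw decreases.
```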

Advantages

The stepped wedge design has several advantages over traditional RCTs. First, SWTs are most appropriate both ethically and practically when the intervention is expected to produce a positive outcome. Since all subjects eventually receive the benefits of the intervention, ethical concerns can be allayed, and the recruitment of participants may become easier.[9] Secondly, SWTs "can reconcile the need for robust evaluations with political or logistical constraints."[12] Specifically, when resources for performing an intervention are scarce, the design can be used to measure the effects of treatment even when a treatment and a control group cannot be compared simultaneously.

Thirdly, since each cluster receives both the control and the treatment condition by the end of the trial, both between- and within-cluster comparisons are possible. In this way, statistical power increases while the sample remains significantly smaller than would be needed in a traditional RCT.[9] Finally, because each cluster switches from the control to the treatment condition at a randomly assigned time point, it is possible to examine time effects.[9] For example, it is possible to study how repeated or long-term exposure to experimental stimuli affects the efficiency of the treatment. Repeated measurements at regular intervals can also average out noise, which in turn increases the precision of the estimates. This advantage becomes most apparent when measurement is noisy and outcome autocorrelation is low.[13]

Disadvantages

SWTs may suffer from certain drawbacks. First, since the study period of an SWT lasts longer and all subjects eventually receive the treatment, costs may increase significantly.[9] Because the design can be expensive, SWTs may not be the optimal solution when measurement precision and outcome autocorrelation are high.[13]

Secondly, in an SWT, more clusters are exposed to the intervention at later than earlier time periods. As such, it is possible that an underlying temporal trend may confound the intervention effect, and so the confounding effect of time must be accounted for in both pre-trial power calculations and post-trial analysis.[4][14][12] Specifically, in post-trial analysis, the use of generalized linear mixed models or generalized estimating equations is recommended.[9]

Finally, the design and analysis of stepped-wedge trials are more complex than for other types of randomized trials. Previous systematic reviews highlighted the poor reporting of sample size calculations and a lack of consistency in the analysis of such trials.[4][5] Hussey and Hughes were the first authors to suggest a structure and formula for estimating power in stepped-wedge studies in which data were collected at every step.[6] This has since been extended to designs in which observations are not made at every step, as well as to designs with multiple layers of clustering.[15] Additionally, a design effect (used to inflate the sample size of an individually randomized trial to that required in a cluster trial) has been established,[9] which has shown that the stepped wedge CRT could reduce the number of patients required in the trial compared to other designs.[9][16]

Ongoing Work

The number of studies using the design has been increasing. In 2015, a thematic series was published in the journal Trials.[17] In 2016, the first international conference dedicated to the topic was held at the University of York.[18][19]

References

  1. The Gambia Hepatitis Study Group (November 1987). "The Gambia Hepatitis Intervention Study". Cancer Research. 47 (21): 5782–7. PMID 2822233.
  2. Ratanawongsa N, Handley MA, Quan J, Sarkar U, Pfeifer K, Soria C, Schillinger D (January 2012). "Quasi-experimental trial of diabetes Self-Management Automated and Real-Time Telephonic Support (SMARTSteps) in a Medicaid managed care plan: study protocol". BMC Health Services Research. 12: 22. doi:10.1186/1472-6963-12-22. PMC 3276419. PMID 22280514.
  3. Løhaugen GC, Beneventi H, Andersen GL, Sundberg C, Østgård HF, Bakkan E, Walther G, Vik T, Skranes J (July 2014). "Do children with cerebral palsy benefit from computerized working memory training? Study protocol for a randomized controlled trial". Trials. 15: 269. doi:10.1186/1745-6215-15-269. PMC 4226979. PMID 24998242.
  4. Brown CA, Lilford RJ (November 2006). "The stepped wedge trial design: a systematic review". BMC Medical Research Methodology. 6: 54. doi:10.1186/1471-2288-6-54. PMC 1636652. PMID 17092344.
  5. Mdege ND, Man MS, Taylor Nee Brown CA, Torgerson DJ (September 2011). "Systematic review of stepped wedge cluster randomized trials shows that design is particularly used to evaluate interventions during routine implementation". Journal of Clinical Epidemiology. 64 (9): 936–48. doi:10.1016/j.jclinepi.2010.12.003. PMID 21411284.
  6. Hussey MA, Hughes JP (February 2007). "Design and analysis of stepped wedge cluster randomized trials". Contemporary Clinical Trials. 28 (2): 182–91. doi:10.1016/j.cct.2006.05.007. PMID 16829207.
  7. Hargreaves JR, Copas AJ, Beard E, Osrin D, Lewis JJ, Davey C, Thompson JA, Baio G, Fielding KL, Prost A (August 2015). "Five questions to consider before conducting a stepped wedge trial". Trials. 16 (1): 350. doi:10.1186/s13063-015-0841-8. PMC 4538743. PMID 26279013.
  8. Copas AJ, Lewis JJ, Thompson JA, Davey C, Baio G, Hargreaves JR (August 2015). "Designing a stepped wedge trial: three main designs, carry-over effects and randomisation approaches". Trials. 16 (1): 352. doi:10.1186/s13063-015-0842-7. PMC 4538756. PMID 26279154.
  9. Woertman W, de Hoop E, Moerbeek M, Zuidema SU, Gerritsen DL, Teerenstra S (2013). "Stepped wedge designs could reduce the required sample size in cluster randomized trials". Journal of Clinical Epidemiology. 66 (7): 752–58.
  10. Baio G, Copas A, Ambler G, Hargreaves J, Beard E, Omar RZ (August 2015). "Sample size calculation for a stepped wedge trial". Trials. 16 (1): 354. doi:10.1186/s13063-015-0840-9. PMC 4538764. PMID 26282553.
  11. Gruber JS, Reygadas F, Arnold BF, Ray I, Nelson K, Colford JM (August 2013). "A stepped wedge, cluster-randomized trial of a household UV-disinfection and safe storage drinking water intervention in rural Baja California Sur, Mexico". The American Journal of Tropical Medicine and Hygiene. 89 (2): 238–45. doi:10.4269/ajtmh.13-0017. PMC 3741243. PMID 23732255.
  12. Hemming K, Haines TP, Chilton PJ, Girling AJ, Lilford RJ (February 2015). "The stepped wedge cluster randomised trial: rationale, design, analysis, and reporting". BMJ. 350: h391. doi:10.1136/bmj.h391. PMID 25662947.
  13. McKenzie D (November 2012). "Beyond baseline and follow-up: The case for more T in experiments". Journal of Development Economics. 99 (2): 210–221.
  14. Van den Heuvel ER, Zwanenburg RJ, Van Ravenswaaij-Arts CM (April 2017). "A stepped wedge design for testing an effect of intranasal insulin on cognitive development of children with Phelan-McDermid syndrome: A comparison of different designs". Statistical Methods in Medical Research. 26 (2): 766–775. doi:10.1177/0962280214558864. PMID 25411323.
  15. Hemming K, Lilford R, Girling AJ (2015). "Stepped-wedge cluster randomised controlled trials: a generic framework including parallel and multiple-level designs". Statistics in Medicine. 34 (2): 181–96.
  16. Keriel-Gascou M, Buchet-Poyau K, Rabilloud M, Duclos A, Colin C (2014). "A stepped wedge cluster randomized trial is preferable for assessing complex health interventions". Journal of Clinical Epidemiology. 67 (7): 831–33.
  17. Torgerson D (2015). "Stepped Wedge Randomized Controlled Trials". Trials. 16: 350. Retrieved 17 February 2017.
  18. "University of York".
  19. Kanaan M, Keding A, Mdege N, Torgerson D (2016). "Proceedings of the First International Conference on Stepped Wedge Trial Design". Trials. 17 (Suppl 1): 311. Retrieved 17 February 2017.