A recent study used an ABA reversal SC design to determine the effectiveness of core stability training in 8 participants with multiple sclerosis.16 During the first 4 weekly data collections, the researchers ensured a stable baseline, which was followed by 8 weekly intervention data points, and concluded with 4 weekly withdrawal data points. Intervention significantly improved participants' walking and reaching performance (Figure 2).16 This A1BA2 design could have been strengthened by the addition of a second intervention phase for replication (A1B1A2B2). For instance, a single-case A1B1A2B2 withdrawal design aimed to assess the efficacy of rehabilitation using visuo-spatio-motor cueing for 2 participants with severe unilateral neglect after a severe right hemisphere stroke.17 Each phase included 8 data points. Statistically significant intervention-related improvement was observed, suggesting that visuo-spatio-motor cueing might be promising for treating individuals with very severe neglect (Figure 3).17
The reversal design can also incorporate a cross-over design where each participant experiences more than 1 type of intervention. For instance, a B1C1B2C2 design could be used to study the effects of 2 different interventions (B and C) on outcome measures. Challenges with including more than 1 intervention involve potential carryover effects from earlier interventions and order effects that may impact the measured effectiveness of the interventions.2,12 Including multiple participants and randomizing the order of intervention phase presentations are tools to help control for these types of effects.19
When an intervention permanently changes an individual's ability, a return-to-baseline performance is not feasible and reversal designs are not appropriate. Multiple baseline designs (MBDs) are useful in these situations (Figure 4).20 Multiple baseline designs feature staggered introduction of the intervention across time: each participant is randomly assigned to 1 of at least 3 experimental conditions characterized by the length of the baseline phase.21 These studies involve more than 1 participant, thus functioning as SC studies with replication across participants. Staggered introduction of the intervention allows for separation of intervention effects from those of maturation, experience, learning, and practice. For example, a multiple baseline SC study was used to investigate the effect of an antispasticity baclofen medication on stiffness in 5 adult males with spinal cord injury.20 The subjects were randomly assigned to receive 5 to 9 baseline data points with a placebo treatment before the initiation of the intervention phase with the medication. Both participants and assessors were blind to the experimental condition. The results suggested that baclofen might not be a universal treatment choice for all individuals with spasticity resulting from a traumatic spinal cord injury (Figure 4).20
The impact of 2 or more interventions can also be assessed via alternating treatment designs (ATDs). In ATDs, after establishing the baseline, the experimenter exposes subjects to different intervention conditions administered in close proximity for equal intervals (Figure 5).22 ATDs are prone to “carryover effects” when the effects of 1 intervention influence the observed outcomes of another intervention.1 As a result, such designs introduce unique challenges when attempting to determine the effects of any 1 intervention and have been less commonly utilized in rehabilitation. An ATD was used to monitor disruptive behaviors in the school setting throughout a baseline followed by an alternating treatment phase with randomized presentation of a control condition or an exercise condition.23 Results showed that 30 minutes of moderate to intense physical activity decreased behavioral disruptions through 90 minutes after the intervention.23 An ATD was also used to compare the effects of commercially available and custom-made video prompts on the performance of multistep cooking tasks in 4 participants with autism.22 Results showed that participants independently performed more steps with the custom-made video prompts (Figure 5).22
Regardless of the SC study design, replication and randomization should be incorporated when possible to improve internal and external validity.11 The reversal design is an example of replication across study phases. The minimum number of phase replications needed to meet quality standards is 3 (A1B1A2B2), but having 4 or more replications is highly recommended (A1B1A2B2A3...).11,14 In cases when interventions aim to produce lasting changes in participants' abilities, replication of findings may be demonstrated by replicating intervention effects across multiple participants (as in multiple-participant AB designs), or across multiple settings, tasks, or service providers. When the results of an intervention are replicated across multiple reversals, participants, and/or contexts, there is an increased likelihood that a causal relationship exists between the intervention and the outcome.2,12
Randomization should be incorporated in SC studies to improve internal validity and the ability to assess for causal relationships among interventions and outcomes.11 In contrast to traditional group designs, SC studies often do not have multiple participants or units that can be randomly assigned to different intervention conditions. Instead, in randomized phase-order designs, the sequence of phases is randomized. Simple or block randomization is possible. For example, with simple randomization for an A1B1A2B2 design, the A and B conditions are treated as separate units and are randomly assigned to be administered for each of the predefined data collection points. As a result, any combination of A-B sequences is possible without restrictions on the number of times each condition is administered or regard for repetitions of conditions (eg, A1B1B2A2B3B4B5A3B6A4A5A6). With block randomization for an A1B1A2B2 design, 2 conditions (eg, A and B) would be blocked into a single unit (AB or BA), randomization of which to different periods would ensure that each condition appears in the resulting sequence more than 2 times (eg, A1B1B2A2A3B3A4B4). Note that AB and reversal designs require that the baseline (A) always precedes the first intervention (B), which should be accounted for in the randomization scheme.2,11
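The two phase-order schemes described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a procedure taken from the cited sources: the function names and the `baseline_first` flag are hypothetical, and the number of measurement occasions is assumed to be fixed in advance.

```python
import random

def simple_randomization(n_points, rng):
    """Assign condition A or B independently to each measurement occasion.

    No restriction is placed on how often each condition appears, so this
    sketch does not by itself guarantee that baseline precedes intervention.
    """
    return [rng.choice("AB") for _ in range(n_points)]

def block_randomization(n_blocks, rng, baseline_first=True):
    """Randomize the order within each two-condition block (AB or BA)."""
    sequence = []
    for block in range(n_blocks):
        pair = ["A", "B"]
        # Assumption: when baseline must come first, the first block stays AB.
        if not (baseline_first and block == 0):
            rng.shuffle(pair)
        sequence.extend(pair)
    return sequence

def label_occurrences(sequence):
    """Number each condition by occurrence, e.g. A1 B1 B2 A2."""
    counts = {"A": 0, "B": 0}
    labels = []
    for condition in sequence:
        counts[condition] += 1
        labels.append(f"{condition}{counts[condition]}")
    return labels

if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so the schedule can be reproduced
    print("".join(label_occurrences(simple_randomization(12, rng))))
    print("".join(label_occurrences(block_randomization(4, rng))))
```

Block randomization guarantees that each condition appears equally often, whereas simple randomization can produce long runs of a single condition.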
In randomized phase start-point designs, the lengths of the A and B phases can be randomized.2,11,24–26 For example, for an AB design, researchers could specify the number of time points at which outcome data will be collected (eg, 20), define the minimum number of data points desired in each phase (eg, 4 for A, 3 for B), and then randomize the initiation of the intervention so that it occurs anywhere between the remaining time points (points 5 and 17 in the current example).27,28 For multiple baseline designs, a dual-randomization or “regulated randomization” procedure has been recommended.29 If multiple baseline randomization depends solely on chance, it could be the case that all units are assigned to begin intervention at points not really separated in time.30 Such randomly selected initiation of the intervention would result in the drastic reduction of the discriminant and internal validity of the study.29 To eliminate this issue, investigators should first specify appropriate intervals between the start points for different units, then randomly select from those intervals, and finally randomly assign each unit to a start point.29
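The three-step regulated randomization procedure for multiple baseline designs can be sketched as follows. The function name, the unit labels, and the example intervals are hypothetical; the sketch assumes the investigator has already chosen non-overlapping intervals of candidate start points, one per unit.

```python
import random

def regulated_randomization(units, intervals, rng):
    """Return a mapping from each unit to a randomly chosen start point.

    intervals: list of (first, last) inclusive ranges of candidate start
    points, specified in advance so that start points stay separated in time.
    """
    if len(units) != len(intervals):
        raise ValueError("one interval is needed per unit")
    # Step 2: randomly select one start point from each prespecified interval.
    start_points = [rng.randint(first, last) for first, last in intervals]
    # Step 3: randomly assign the units to the selected start points.
    shuffled_units = list(units)
    rng.shuffle(shuffled_units)
    return dict(zip(shuffled_units, start_points))

if __name__ == "__main__":
    rng = random.Random(29)
    plan = regulated_randomization(
        ["P1", "P2", "P3"],             # hypothetical participants
        [(5, 7), (10, 12), (15, 17)],   # hypothetical start-point intervals
        rng,
    )
    for unit, start in sorted(plan.items()):
        print(unit, "begins intervention at occasion", start)
```

Because the intervals do not overlap, no two units can begin the intervention at the same occasion, whichever values are drawn.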
SINGLE-CASE ANALYSIS TECHNIQUES FOR INTERVENTION RESEARCH
The What Works Clearinghouse (WWC) single-case design technical documentation provides an excellent overview of appropriate SC study analysis techniques to evaluate intervention effects.1,18 First, visual analyses are recommended to determine whether there is a functional relationship between the intervention and the outcome. Second, if evidence for a functional relationship is present, the visual analysis is supplemented with quantitative analysis methods evaluating the magnitude of the intervention effect. Third, effect sizes are combined across cases to estimate overall average intervention effects, which contribute to evidence-based practice, theory, and future applications.2,18
Traditionally, SC study data are presented graphically. When more than 1 participant engages in a study, a spaghetti plot showing all of their data in the same figure can be helpful for visualization. Visual analysis of graphed data has been the traditional method for evaluating treatment effects in SC research.1,12,31,32 The visual analysis involves evaluating level, trend, and stability of the data within each phase (ie, within-phase data examination) followed by examination of the immediacy of effect, consistency of data patterns, and overlap of data between baseline and intervention phases (ie, between-phase comparisons). When the changes (and/or variability) in level are in the desired direction, are immediate, readily discernible, and maintained over time, it is concluded that the changes in behavior across phases result from the implemented treatment and are indicative of improvement.33 Three demonstrations of an intervention effect are necessary for establishing a functional relationship.1
Level, trend, and stability of the data within each phase are evaluated. Mean and/or median can be used to report the level, and trend can be evaluated by determining whether the data points are monotonically increasing or decreasing. Within-phase stability can be evaluated by calculating the percentage of data points within 15% of the phase median (or mean). The stability criterion is satisfied if about 85% (80%–90%) of the data in a phase fall within a 15% range of the median (or average) of all data points for that phase.34
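The stability criterion can be checked with a few lines of Python. This sketch assumes that "within 15%" means within plus or minus 15% of the phase median and uses 80% as the required share of points; both the function names and these defaults are illustrative choices, not values prescribed by the cited source.

```python
from statistics import median

def percent_within_band(data, band=0.15):
    """Percentage of data points lying within +/- band * median of the median."""
    center = median(data)
    half_width = band * abs(center)
    inside = sum(1 for y in data if abs(y - center) <= half_width)
    return 100.0 * inside / len(data)

def is_stable(data, band=0.15, required_percent=80.0):
    """True when enough points fall inside the band around the phase median."""
    return percent_within_band(data, band) >= required_percent

if __name__ == "__main__":
    baseline = [10, 11, 9, 10, 12, 10, 10, 15]  # hypothetical phase data
    print(percent_within_band(baseline), is_stable(baseline))
```

For the hypothetical phase shown, the median is 10, so the band runs from 8.5 to 11.5 and 6 of the 8 points fall inside it.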
Immediacy of effect, consistency of data patterns, and overlap of data between baseline and intervention phases are evaluated next. For this, several nonoverlap indices have been proposed that all quantify the proportion of measurements in the intervention phase not overlapping with the baseline measurements.35 Nonoverlap statistics are typically scaled as a percentage from 0 to 100, or as a proportion from 0 to 1. Here, we briefly discuss the nonoverlap of all pairs (NAP),36 the extended celeration line (ECL), the improvement rate difference (IRD),37 TauU, and its trend-adjusted variant, TauUadj,35 as these are the most recent and complete techniques. We also examine the percentage of nonoverlapping data (PND)38 and the 2-standard deviation band method, as these are frequently used techniques. In addition, we include the percentage of nonoverlapping corrected data (PNCD), an index that applies the PND after controlling for baseline trend.39
Nonoverlap of All Pairs
Each baseline observation can be paired with each intervention phase observation to make n pairs (ie, n = nA × nB). Count the number of overlapping pairs, n0, counting all ties as 0.5. NAP is then defined as the percentage of the pairs that show no overlap:
$$\mathrm{NAP} = \frac{n - n_0}{n} \times 100$$
Alternatively, one can count the number of positive (P), negative (N), and tied (T) pairs2,36:
$$\mathrm{NAP} = \frac{P + 0.5\,T}{P + N + T} \times 100$$
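The pairwise counting behind NAP can be sketched in Python. The sketch assumes the positive/negative/tied formulation, NAP = (P + 0.5T)/(P + N + T), and returns a proportion rather than a percentage; the function name and the `increase_expected` flag are illustrative.

```python
def nap(baseline, intervention, increase_expected=True):
    """Nonoverlap of all pairs, returned as a proportion between 0 and 1."""
    positive = negative = tied = 0
    for a in baseline:
        for b in intervention:
            # A pair is "positive" when it changes in the expected direction.
            diff = (b - a) if increase_expected else (a - b)
            if diff > 0:
                positive += 1
            elif diff < 0:
                negative += 1
            else:
                tied += 1
    total = positive + negative + tied  # equals nA * nB
    return (positive + 0.5 * tied) / total

if __name__ == "__main__":
    # Hypothetical data: 9 pairs, 8 positive, 1 tied, 0 negative.
    print(round(nap([2, 3, 4], [4, 5, 6]), 3))
```

Counting overlapping pairs instead gives the same result, because each negative pair counts as 1 overlap and each tie as 0.5.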
Extended Celeration Line
ECL, or the split-middle line, allows control for a positive phase A trend. Nonoverlap is defined as the proportion of phase B data (nB) that fall above the median trend line fitted to the phase A data and then extended into phase B:
$$\mathrm{ECL} = \frac{n_{B,\ \text{above median trend A}}}{n_B} \times 100$$
As a consequence, this method depends on a straight line and makes an assumption of linearity in the baseline.2,12
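One simplified way to compute the ECL is sketched below. It assumes a basic split-middle construction: the baseline is split into halves, the line passes through the median time and median score of each half, and no further adjustment of the line is made. Fuller descriptions of the split-middle technique may add such an adjustment, so this is an approximation with illustrative function names.

```python
from statistics import median

def split_middle_line(baseline):
    """Slope and intercept of a line through the medians of each baseline half.

    Occasions are numbered 1..nA; the middle point is skipped when nA is odd.
    """
    n = len(baseline)
    if n < 2:
        raise ValueError("at least 2 baseline points are needed")
    half = n // 2
    x1 = median(range(1, half + 1))
    x2 = median(range(n - half + 1, n + 1))
    y1 = median(baseline[:half])
    y2 = median(baseline[n - half:])
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

def ecl(baseline, intervention):
    """Proportion of phase B points above the extended baseline trend line."""
    slope, intercept = split_middle_line(baseline)
    first_b = len(baseline) + 1  # phase B continues the occasion numbering
    above = sum(
        1
        for occasion, y in enumerate(intervention, start=first_b)
        if y > slope * occasion + intercept
    )
    return above / len(intervention)

if __name__ == "__main__":
    print(ecl([1, 2, 3, 4], [5, 7, 8, 6]))  # hypothetical data
```

In the hypothetical example, the baseline rises by 1 unit per occasion, so only phase B scores that exceed the projected trend count as nonoverlapping.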
Improvement Rate Difference
This analysis is conceptualized as the difference in improvement rates (IR) between baseline (IRB) and intervention phases (IRT).38 The IR for each phase is defined as the number of “improved data points” divided by the total data points in that phase. Improvement rate difference, commonly employed in medical group research under the name of “risk reduction” or “risk difference,” attempts to provide an intuitive interpretation for nonoverlap and to make use of an established, respected effect size, IRT − IRB, or the difference between 2 proportions.37
TauU and TauUadj
Each baseline observation can be paired with each intervention phase observation to make n pairs (ie, n = nA × nB). Count the number of positive (P), negative (N), and tied (T) pairs, and use the following formula:
$$\mathrm{TauU} = \frac{P - N}{P + N + T}$$
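The TauUadj is an adjustment of TauU for monotonic trend in baseline. As for TauU, each baseline observation is paired with each intervention phase observation to make n pairs (ie, n = nA × nB). In addition, each baseline observation is paired with all later baseline observations (nA × (nA − 1)/2 pairs), and the positive (PA) and negative (NA) pairs among them are counted.2,35 Then the baseline trend can be computed and removed:
$$\mathrm{TauU}_{adj} = \frac{P - N - S_{trend}}{P + N + T}; \quad S_{trend} = P_A - N_A.$$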
Online calculators might assist researchers in obtaining the TauU and TauU adjusted coefficients (http://www.singlecaseresearch.org/calculators/tau-u).
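For readers who prefer to verify a calculator's output by hand, the pair counts can be sketched in Python. The sketch assumes TauU = (P − N)/(P + N + T) and TauUadj = (P − N − Strend)/(P + N + T) with Strend = PA − NA, and assumes that an increase in the outcome is the desired direction; the function names are illustrative.

```python
def count_between_phase_pairs(baseline, intervention):
    """Return (P, N, T) over all nA * nB baseline-intervention pairs."""
    positive = sum(1 for a in baseline for b in intervention if b > a)
    negative = sum(1 for a in baseline for b in intervention if b < a)
    tied = len(baseline) * len(intervention) - positive - negative
    return positive, negative, tied

def baseline_trend(baseline):
    """S_trend = P_A - N_A over all nA * (nA - 1) / 2 within-baseline pairs."""
    s_trend = 0
    for i in range(len(baseline)):
        for j in range(i + 1, len(baseline)):
            if baseline[j] > baseline[i]:
                s_trend += 1
            elif baseline[j] < baseline[i]:
                s_trend -= 1
    return s_trend

def tau_u(baseline, intervention):
    p, n, t = count_between_phase_pairs(baseline, intervention)
    return (p - n) / (p + n + t)

def tau_u_adj(baseline, intervention):
    p, n, t = count_between_phase_pairs(baseline, intervention)
    return (p - n - baseline_trend(baseline)) / (p + n + t)

if __name__ == "__main__":
    a, b = [2, 3, 4], [4, 5, 6]  # hypothetical data with a rising baseline
    print(round(tau_u(a, b), 3), round(tau_u_adj(a, b), 3))
```

In the hypothetical example, the rising baseline contributes Strend = 3, which lowers the estimate from 8/9 to 5/9.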
Percentage of Nonoverlapping Data
If anticipating an increase in the outcome, locate the highest data point in the baseline phase and then calculate the percent of the intervention phase data points that exceed it. If anticipating a decrease in the outcome, find the lowest data point in the baseline phase and then calculate the percent of the treatment phase data points that are below it:
$$\mathrm{PND} = \frac{n_{B,\ \text{nonoverlapping with A}}}{n_B} \times 100$$
A PND less than 50 would mark no observed effect, PND = 50 to 70 signifies a questionable effect, and PND more than 70 suggests the intervention was effective.40 The percentage of nonoverlapping corrected data (PNCD) was proposed in 2009 as an extension of the PND.39 Before applying the PND, a data correction procedure is applied, eliminating preexisting baseline trend.38
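The PND calculation and its interpretation thresholds can be sketched as follows. The function names are illustrative, and the sketch does not include the baseline trend correction used for the PNCD.

```python
def pnd(baseline, intervention, increase_expected=True):
    """Percentage of intervention points beyond the most extreme baseline point."""
    if increase_expected:
        threshold = max(baseline)
        count = sum(1 for y in intervention if y > threshold)
    else:
        threshold = min(baseline)
        count = sum(1 for y in intervention if y < threshold)
    return 100.0 * count / len(intervention)

def interpret_pnd(value):
    """Apply the thresholds quoted in the text (below 50, 50 to 70, above 70)."""
    if value < 50:
        return "no observed effect"
    if value <= 70:
        return "questionable effect"
    return "effective"

if __name__ == "__main__":
    score = pnd([2, 3, 4], [4, 5, 6, 3])  # hypothetical data
    print(score, interpret_pnd(score))
```

Because the PND depends on a single extreme baseline point, one outlier in phase A can strongly reduce the index.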
Two Standard Deviation Band Method
When the stability criterion described earlier is met within phases, it is possible to apply the 2-standard deviation band method.12,41 First, the mean of the data for a specific condition is calculated and represented with a solid line. In the next step, the standard deviation of the same data is computed, and 2 dashed lines are represented: one located 2 standard deviations above the mean and the other 2 standard deviations below. For normally distributed data, few points (<5%) are expected to be outside the 2-standard deviation bands if there is no change in the outcome score because of the intervention. However, this method is not considered a formal statistical procedure, as the data cannot typically be assumed to be normal, continuous, or independent.41
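The band construction can be sketched in Python. The sketch assumes the band is built from the baseline phase and uses the sample standard deviation; neither choice is specified in the text, and the function names are illustrative.

```python
from statistics import mean, stdev

def two_sd_band(reference_phase):
    """Return (lower, upper) limits: mean minus and plus 2 standard deviations."""
    center = mean(reference_phase)
    spread = stdev(reference_phase)  # assumption: sample standard deviation
    return center - 2 * spread, center + 2 * spread

def points_outside_band(reference_phase, comparison_phase):
    """Comparison-phase points that fall outside the reference-phase band."""
    lower, upper = two_sd_band(reference_phase)
    return [y for y in comparison_phase if y < lower or y > upper]

if __name__ == "__main__":
    baseline = [4, 6, 5, 5, 4, 6]  # hypothetical baseline data
    print(two_sd_band(baseline))
    print(points_outside_band(baseline, [6, 7, 8, 5]))
```

As the text notes, points outside the band are a descriptive aid rather than a formal statistical test.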
If the visual analysis indicates a functional relationship (ie, 3 demonstrations of the intervention effect), it is recommended to proceed with the quantitative analyses, reflecting the magnitude of the intervention effect. First, effect sizes are calculated for each participant (individual-level analysis). Second, if the research interest lies in the generalizability of the effect size across participants, effect sizes can be combined across cases to achieve an overall average effect size estimate (across-case effect size).
Note that quantitative analysis methods are still being developed in the domain of SC research1 and statistical challenges of producing an acceptable measure of treatment effect remain.14,42,43 Therefore, the WWC standards strongly recommend conducting sensitivity analysis and reporting multiple effect size estimators. If consistency across different effect size estimators is identified, there is stronger evidence for the effectiveness of the treatment.1,18
Individual-Level Effect Size Analysis
The most common effect sizes recommended for SC analysis are (1) standardized mean difference Cohen's d; (2) standardized mean difference with correction for small sample sizes Hedges' g; and (3) the regression-based approach, which has the most potential and is strongly recommended by the WWC standards.1,44,45 Cohen's d can be calculated using the following formula:
$$d = \frac{\bar{X}_B - \bar{X}_A}{s_p}$$
with $\bar{X}_A$ being the baseline mean, $\bar{X}_B$ being the treatment mean, and $s_p$ indicating the pooled within-case standard deviation. Hedges' g is an extension of Cohen's d, recommended in the context of SC studies, as it corrects for small sample sizes. The piecewise regression-based approach reflects not only the immediate intervention effect but also the intervention effect across time:
$$Y_i = \beta_0 + \beta_1 T_i + \beta_2 D_i + \beta_3 T_i D_i + e_i$$
Here, i stands for the measurement occasion (i = 0, 1, ..., I), and ρ indicates the autocorrelation parameter of the errors $e_i$. If ρ is positive, the errors closer in time are more similar; if ρ is negative, the errors closer in time are more different; and if ρ equals 0, there is no correlation between the errors. The dependent variable is regressed on a time indicator, T, which is centered around the first observation of the intervention phase; on D, a dummy variable for the intervention phase; and on an interaction term of these variables. The equation shows that the expected score, $\hat{Y}_i$, equals β0 + β1Ti in the baseline phase, and (β0 + β2) + (β1 + β3)Ti in the intervention phase. β0, therefore, indicates the expected baseline level at the start of the intervention phase (when T = 0), whereas β1 marks the linear time trend in the baseline scores. The coefficient β2 can then be interpreted as an immediate effect of the intervention on the outcome, whereas β3 signifies the effect of the intervention across time. The $e_i$'s are residuals assumed to be normally distributed around a mean of 0 with a variance of $\sigma^2_e$. The assumption of independence of errors is usually not met in the context of SC studies because repeated measures are obtained within a person. As a consequence, it can be the case that the residuals are autocorrelated, meaning that errors closer in time are more related to each other compared with errors further away in time.46–48 Therefore, a lag-1 autocorrelation is appropriate (taking into account the correlation between 2 consecutive errors, $e_i$ and $e_{i-1}$; for more details see Verbeke and Molenberghs49).
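The coding of the time indicator T and the phase dummy D can be made concrete with a short Python sketch. It assumes the model Y = β0 + β1T + β2D + β3TD + e and shows only how the predictors are constructed and how the coefficients combine into expected scores; it does not estimate the coefficients or model the autocorrelation. The function names and the coefficient values are hypothetical.

```python
def design_row(occasion, first_intervention_occasion):
    """Return the predictors (1, T, D, T*D) for one measurement occasion."""
    # T is centered so that T = 0 at the first intervention observation.
    t = occasion - first_intervention_occasion
    # D is 0 during baseline and 1 during the intervention phase.
    d = 1 if occasion >= first_intervention_occasion else 0
    return (1, t, d, t * d)

def expected_score(row, betas):
    """Combine one design row with (beta0, beta1, beta2, beta3)."""
    return sum(x * b for x, b in zip(row, betas))

if __name__ == "__main__":
    betas = (10.0, 0.5, 3.0, 1.0)  # hypothetical coefficient values
    for occasion in range(8):      # occasions 0..7, intervention starts at 4
        row = design_row(occasion, first_intervention_occasion=4)
        print(occasion, row, expected_score(row, betas))
```

With these hypothetical values, the expected score at the first intervention occasion is β0 + β2 = 13, and the slope changes from β1 = 0.5 in baseline to β1 + β3 = 1.5 during intervention.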
Across-Case Effect Sizes
Two-level modeling to estimate the intervention effects across cases can be used to evaluate across-case effect sizes.44,45,50 Multilevel modeling is recommended by the WWC standards because it takes the hierarchical nature of SC studies into account: measurements are nested within cases and cases, in turn, are nested within studies. By conducting a multilevel analysis, important research questions can be addressed (which cannot be answered by single-level analysis of SC study data), such as (1) What is the magnitude of the average treatment effect across cases? (2) What is the magnitude and direction of the case-specific intervention effect? (3) How much does the treatment effect vary within cases and across cases? (4) Does a case and/or study-level predictor influence the treatment's effect? The 2-level model has been validated in previous research using extensive simulation studies.45,46,51 The 2-level model appears to have sufficient power (>0.80) to detect large treatment effects in at least 6 participants with 6 measurements.21
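For orientation, a basic two-level model for a change in level might be written as follows. This is a sketch in the spirit of the hierarchical linear models cited above, not the exact specification of any one study; the symbols θ and u are notation assumed here, and time trends are omitted for brevity.

```latex
% Level 1: score of case j at measurement occasion i
Y_{ij} = \beta_{0j} + \beta_{1j} D_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2_e)

% Level 2: case-specific baseline level and treatment effect
\beta_{0j} = \theta_{00} + u_{0j}, \qquad \beta_{1j} = \theta_{10} + u_{1j}
```

In this notation, θ10 corresponds to the average treatment effect across cases (question 1), β1j to the case-specific effect (question 2), and the variance of u1j to the variation in the treatment effect across cases (question 3).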
Furthermore, to estimate the across-case effect sizes, the HPS (Hedges, Pustejovsky, and Shadish), or single-case educational design (SCEdD)-specific mean difference, index can be calculated.52 This is a standardized mean difference index specifically designed for SCEdD data, with the aim of making it comparable to Cohen's d of group-comparison designs. The standard deviation takes into account both within-participant and between-participant variability, and is typically used to get an across-case estimator for a standardized change in level. The advantage of using the HPS across-case effect size estimator is that it is directly comparable with Cohen's d for group comparison research, thus enabling the use of Cohen's (1988) benchmarks.53
Valuable recommendations on SC data analyses have recently been provided.54,55 They suggest that a specific SC study data analytic technique can be chosen on the basis of (1) the study aims and the desired quantification (eg, overall quantification, between-phase quantifications, and randomization), (2) the data characteristics as assessed by visual inspection and the assumptions one is willing to make about the data, and (3) the knowledge and computational resources.54,55 Table 1 lists recommended readings and some commonly used resources related to the design and analysis of single-case studies.
QUALITY APPRAISAL TOOLS FOR SINGLE-CASE DESIGN RESEARCH
Quality appraisal tools are important to guide researchers in designing strong experiments and conducting high-quality systematic reviews of the literature. Unfortunately, quality assessment tools for SC studies are relatively novel, ratings across tools demonstrate variability, and there is currently no “gold standard” tool.56 Table 2 lists important SC study quality appraisal criteria compiled from the most common scales; when planning studies or reviewing the literature, we recommend that readers consider these criteria. Table 3 lists some commonly used SC quality assessment and reporting tools and references to resources where the tools can be located.
When an established tool is required for systematic review, we recommend use of the WWC tool because it has well-defined criteria and is developed and supported by leading experts in the SC research field in association with the Institute of Education Sciences.18 The WWC documentation provides clear standards and procedures to evaluate the quality of SC research; it assesses the internal validity of SC studies, classifying them as “meeting standards,” “meeting standards with reservations,” or “not meeting standards.”1,18 Only studies classified in the first 2 categories are recommended for further visual analysis. Also, WWC evaluates the evidence of effect, classifying studies into “strong evidence of a causal relation,” “moderate evidence of a causal relation,” or “no evidence of a causal relation.” Effect size should only be calculated for studies providing strong or moderate evidence of a causal relation.
The Single-Case Reporting Guideline In BEhavioural Interventions (SCRIBE) 2016 is another useful SC research tool developed recently to improve the quality of single-case designs.57 SCRIBE consists of a 26-item checklist that researchers need to address while reporting the results of SC studies. This practical checklist allows for critical evaluation of SC studies during study planning, manuscript preparation, and review.
Single-case studies can be designed and analyzed in a rigorous manner that strengthens researchers' ability to assess causal relationships among interventions and outcomes and to generalize their results.2,12 These studies can be strengthened by incorporating replication of findings across multiple study phases, participants, settings, or contexts, and by using randomization of conditions or phase lengths.11 There are a variety of tools that can allow researchers to objectively analyze findings from SC studies.56 Although a variety of quality assessment tools exist for SC studies, they can be difficult to locate and utilize without experience, and different tools can provide variable results. The WWC quality assessment tool is recommended for those aiming to systematically review SC studies.1,18
SC studies, like all types of study designs, have a variety of limitations. First, it can be challenging to collect at least 5 data points in a given study phase. This may be especially true when traveling for data collection is difficult for participants, or during the baseline phase when delaying intervention may not be safe or ethical. Power in SC studies is related to the number of data points gathered for each participant, so it is important to avoid having a limited number of data points.12,58 Second, SC studies are not always designed in a rigorous manner and, thus, may have poor internal validity. This limitation can be overcome by addressing key characteristics that strengthen SC designs (Table 2).1,14,18 Third, SC studies may have poor generalizability. This limitation can be overcome by including a greater number of participants, or units. Fourth, SC studies may require consultation from expert methodologists and statisticians to ensure proper study design and data analysis, especially to manage issues like autocorrelation and variability of data.2 Fifth, although it is recommended to achieve a stable level and rate of performance throughout the baseline, human performance is quite variable and can make this requirement challenging. Finally, the most important validity threat to SC studies is maturation. This challenge must be considered during the design process to strengthen SC studies.1,2,12,58
SC studies can be particularly useful for rehabilitation research. They allow researchers to closely track and report change at the level of the individual. They may require fewer resources and, thus, can allow for high-quality experimental research, even in clinical settings. Furthermore, they provide a tool for assessing causal relationships in populations and settings where large numbers of participants are not accessible. For all of these reasons, SC studies can serve as an effective method for assessing the impact of interventions.
1. Kratochwill TR, Hitchcock J, Horner RH, et al. Single case designs technical documentation. What Works Clearinghouse: Procedures and Standards Handbook. http://files.eric.ed.gov/fulltext/ED510743.pdf. Published 2010.
2. Kratochwill TR, Levin JR, eds. Single-Case Intervention Research: Methodological and Statistical Advances. Washington, DC: American Psychological Association; 2014.
3. Barlow DH, Nock MK, Hersen M. Single Case Experimental Designs: Strategies for Studying Behavior Change. 3rd ed. Boston, MA: Allyn & Bacon; 2008.
4. Kazdin AE. Single-Case Research Designs: Methods for Clinical and Applied Settings. 2nd ed. New York, NY: Oxford University Press; 2010.
5. Onghena P. Single-case designs. In: Everitt BS, Howell DC, eds. Encyclopedia of Statistics in Behavioral Science. Vol 4. Chichester, England: Wiley; 2005:1850–1854.
6. Tate RL, McDonald S, Perdices M, Togher L, Schultz R, Savage S. Rating the methodological quality of single-subject designs and n-of-1 trials: introducing the Single-Case Experimental Design (SCED) Scale. Neuropsychol Rehabil. 2008;18(4):385–401.
7. Jewell DV. Guide to Evidence-Based PT Practice. 3rd ed. Burlington, MA: Jones & Bartlett Learning; 2014.
8. Sanson-Fisher RW, Bonevski B, Green LW, D'Este C. Limitations of the randomized controlled trial in evaluating population-based health interventions. Am J Prev Med. 2007;33(2):155–161.
9. Cartwright N, Munro E. The limitations of randomized controlled trials in predicting effectiveness. J Eval Clin Pract. 2010;16(2):260–266.
10. Lefkowitz W, Jefferson TC. Medicine at the limits of evidence: the fundamental limitation of the randomized clinical trial and the end of equipoise. J Perinatol. 2014;34(4):249–251.
11. Kratochwill TR, Levin JR. Enhancing the scientific credibility of single-case intervention research: randomization to the rescue. Psychol Methods. 2010;15(2):124–144.
12. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. Philadelphia, PA: F. A. Davis Company; 2015.
13. Kratochwill TR, Levin JR. Single-Case Research Design and Analysis. Hillsdale, NJ: Lawrence Erlbaum Associates; 1992.
14. Horner RH, Carr EG, Halle J, McGee G, Odom S, Wolery M. The use of single-subject research to identify evidence-based practice in special education. Except Children. 2005;71(2):165–179.
15. Dionne M, Martini R. Floor Time Play with a child with autism: a single subject study. Can J Occup Ther. 2011;78(3):196–203.
16. Freeman JA, Gear M, Pauli A, et al. The effect of core stability training on balance and mobility in ambulant individuals with multiple sclerosis: a multi-centre series of single case studies. Mult Scler. 2010;16(11):1377–1384.
17. Samuel C, Louis-Dreyfus A, Kaschel R, et al. Rehabilitation of very severe unilateral neglect by visuo-spatio-motor cueing: two single case studies. Neuropsychol Rehabil. 2000;10(4):385–399.
18. What Works Clearinghouse. What Works Clearinghouse: Procedures and Standards Handbook. Version 3.0:1-91. Washington, DC: Institute of Education Sciences.
19. Wellek S, Blettner M. On the proper use of the crossover design in clinical trials: part 18 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2012;109(15):276–281.
20. Hinderer SR, Lehmann JF, Price R, White O, deLateur BJ, Deitz J. Spasticity in spinal cord injured persons: quantitative effects of baclofen and placebo treatments. Am J Phys Med Rehab. 1990;69(6):311–317.
21. Ferron JM, Moeyaert M, Van den Noortgate W, Beretvas SN. Estimating causal effects from multiple-baseline studies: implications for design and analysis. Psychol Methods. 2014;19(4):493–510.
22. Mechling LC, Ayres KM, Foster AL, Bryant KJ. Comparing the effects of commercially available and custom-made video prompting for teaching cooking skills to high school students with autism. Rem Spec Educ. 2013;34(6):371–383.
23. Folino A, Ducharme JM, Greenwald N. Temporal effects of antecedent exercise on students' disruptive behaviors: an exploratory study. J School Psychol. 2014;52(5):447–462.
24. Edgington ES, Onghena P. Randomization Tests. Boca Raton, FL: Chapman & Hall/CRC; 2007.
25. Onghena P, Edgington ES. Customization of pain treatments: single-case design and analysis. Clin J Pain. 2005;21:56–68.
26. Todman JB, Dugard P. Single-Case and Small-n Experimental Designs: A Practical Guide to Randomization Tests. Mahwah, NJ: Erlbaum; 2001.
27. Edgington ES. Randomization tests for one-subject operant experiments. J Psychol. 1975;90(1):57–68.
28. Edgington ES. Nonparametric tests for single-case experiments. Single-Case Research Design and Analysis. Hillsdale, NJ: Erlbaum; 1992.
29. Koehler MJ, Levin JR. Regulated Randomization: a potentially sharper analytical tool for the multiple-baseline design. Psychol Methods. 1998;3(2):206–217.
30. Marascuilo LA, Busk PL. Combining statistics for multiple-baseline AB and replicated ABAB designs across subjects. Behav Assess. 1988;10:1–28.
31. Ferron J, Jones PK. Tests for the visual analysis of response-guided multiple-baseline data. J Exp Educ. 2006;75(1):66–81.
32. Horner RH, Swaminathan H, Sugai G, Smolkowski K. Considerations for the systematic analysis and use of single-case research. Educ Treat Children. 2012;35(2):269–290.
33. Busse RT, Kratochwill TR, Elliott SN. Meta-analysis for single-case consultation outcomes: applications to research and practice. J School Psychol. 1995;33(4):269–285.
34. Neuman SB, McCormick S. Single-Subject Experimental Research: Applications for Literacy. Newark, DE: International Reading Association; 1995.
35. Parker RI, Vannest KJ, Davis JL. Effect size in single-case research: a review of nine nonoverlap techniques. Behav Modif. 2011;35(4):303–322.
36. Parker RI, Vannest K. An improved effect size for single-case research: nonoverlap of all pairs. Behav Ther. 2009;40(4):357–367.
37. Parker RI, Vannest KJ, Brown L. The “improvement rate difference” for single-case research. Except Child. 2009;75(2):135–150.
38. Scruggs TE, Mastropieri MA. PND at 25: past, present, and future trends in summarizing single subject research. Rem Spec Educ. 2013;34:9–19.
39. Manolov R, Solanas A. Percentage of nonoverlapping corrected data. Behav Res Meth. 2009;41:1262–1271.
40. Scruggs TE, Mastropieri MA. Summarizing single-subject research: issues and applications. Behav Modif. 1998;22:221–242.
41. Callahan CD, Barisa MT. Statistical process control and rehabilitation outcome: the single-subject design reconsidered. Rehabil Psychol. 2005;50(1):24–33.
42. Beretvas SN, Chung H. A review of meta-analyses of single-subject experimental designs: methodological issues and practice. Evid Based Commun Assess Interv. 2008;2(3):129–141.
43. Shadish WR, Rindskopf DM. Methods for evidence-based practice: quantitative synthesis of single-subject designs. New Dir Eval. 2007;113:95–109.
44. Van den Noortgate W, Onghena P. Hierarchical linear models for the quantitative integration of effect sizes in single-case research. Behav Res Meth Instrum Comput. 2003;35(1):1–10.
45. Van den Noortgate W, Onghena P. Combining single-case experimental data using hierarchical linear models. Sch Psychol Q. 2003;18(3):325–346.
46. Ferron JM, Bell BA, Hess MR, Rendina-Gobioff G, Hibbard ST. Making treatment effect inferences from multiple-baseline data: the utility of multilevel modeling approaches. Behav Res Meth. 2009;41(2):372–384.
47. Huitema BE, McKean JW. Reduced bias autocorrelation estimation: three Jackknife methods. Educ Psychol Meas. 1994;54(3):654–665.
48. McKnight SD, McKean JW, Huitema BE. A double bootstrap method to analyze linear models with autoregressive error terms. Psychol Methods. 2000;5(1):87–101.
49. Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York, NY: Springer; 2000.
50. Van den Noortgate W, Onghena P. A multilevel meta-analysis of single-subject experimental design studies. Evid Based Commun Assess Interv. 2008;2(3):142–151.
51. Ferron JM, Farmer JL, Owens CM. Estimating individual treatment effects from multiple-baseline data: a Monte Carlo study of multilevel-modeling approaches. Behav Res Meth. 2010;42(4):930–943.
52. Hedges LV, Pustejovsky JE, Shadish WR. A standardized mean difference effect size for single case designs. Res Synth Methods. 2012;3(3):224–239.
53. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
54. Manolov R, Moeyaert M. How can single-case data be analyzed? Software resources, tutorial, and reflections on analysis. Behav Modif. 2017;41(2):179–228.
55. Manolov R, Moeyaert M. Recommendations for choosing single-case data analytical techniques. Behav Ther. 2017;48(1):97–114.
56. Wendt O, Miller B. Quality appraisal of single-subject experimental designs: an overview and comparison of different appraisal tools. Educ Treat Children. 2012;35(2):235–286.
57. Tate RL, Perdices M, Rosenkoetter U, et al. The Single-Case Reporting guideline In BEhavioural interventions (SCRIBE) 2016 statement. J School Psychol. 2016;56:133–142.
58. Gast DL, Ledford JR. Single Case Research Methodology: Applications in Special Education and Behavioral Sciences. New York, NY: Routledge; 2014.
59. Feeney TJ, Ylvisaker M. Context-sensitive cognitive-behavioral supports for young children with TBI. J Posit Behav Interv. 2008;10(2):115–128.
60. Lin C-Y, Chang Y-M. Increase in physical activities in kindergarten children with cerebral palsy by employing MaKey-MaKey-based task systems. Res Dev Disabil. 2014;35:1963–1969.
61. Lane-Brown A, Tate R. Evaluation of an intervention for apathy after traumatic brain injury: a multiple-baseline, single-case experimental design. J Head Trauma Rehab. 2010;25(6):459–469.
62. Lieberman LJ, Dunn JM, van der Mars H, McCubbin J. Peer tutors' effects on activity levels of deaf students in inclusive elementary physical education. Adapt Phys Act Q. 2000;17(1):20–39.
63. Lundblom EEG, Woods JJ. Working in the classroom: improving idiom comprehension through classwide peer tutoring. Commun Disord Q. 2012;33(4):202–219.
64. Banda DR, Hart SL, Liu-Gitz L. Impact of training peers and children with autism on social skills during center time activities in inclusive classrooms. Res Autism Spect Dis. 2010;4(4):619–625.
65. Oddo M, Barnett DW, Hawkins RO, Musti-Rao S. Reciprocal peer tutoring and repeated reading: Increasing practicality using student groups. Psychol Schools. 2010;47(8):842–858.
66. Peterson-Brown S, Karich AC, Symons FJ. Examining estimates of effect using non-overlap of all pairs in multiple baseline studies of academic intervention. J Behav Educ. 2012;21(3):203–216.
67. Chen M, Hyppa-Martin JK, Reichle JE, Symons FJ. Comparing single case design overlap-based effect size metrics from studies examining speech generating device interventions. Am J Intellect Dev Disabil. 2016;121(3):169–193.
68. Derakhshandeh F, Nikmaram M, Hosseinabad HH, et al. Speech characteristics after articulation therapy in children with cleft palate and velopharyngeal dysfunction: a single case experimental design. Int J Pediatr Otorhinolaryngol. 2016;86:104–113.
69. Klingbeil D, Moeyaert M, Archerm C, Chimnoza TM, Zwolski SA. Examining the efficacy of peer-mediated incremental rehearsal. School Psychol Rev. 2017;46:122–140.
70. Asaro-Saddler K, Saddler B, Moeyaert M, Ellis-Robinson T. Effects of a summarizing strategy on written summaries of children with emotional and behavioral disorders. Rem Spec Educ. 2017; doi: 10.1177/0741932516669051.
71. Ingersoll B, Wainer A. Initial efficacy of project ImPACT: a parent-mediated social communication intervention for young children with ASD. J Autism Dev Disord. 2013;43(12):2943–2952.
72. Wade CA, Ortiz C, Gorman BS. Two-session group parent training for bedtime noncompliance in head start preschoolers. Child Fam Behav Ther. 2007;29(3):23–55.
73. Hartman DP, Barrios BA, Wood DD. Principles of behavioral observation. In: Haynes SN, Heiby EM, eds. Comprehensive Handbook of Psychological Assessment. Behavioral Assessment. Vol 3. New York, NY: Wiley; 2004.
74. Reichow B, Volkmar F, Cicchetti D. Development of the evaluative method for evaluating and determining evidence-based practices in autism. J Autism Dev Disord. 2008;38(7):1311–1319.
75. Simeonsson R, Bailey D. Evaluating programme impact: levels of certainty. In: IDMRB ed. Early Intervention Studies for Young Children With Special Needs. London, England: Chapman & Hall; 1991:280–296.
76. Schlosser RW, Sigafoos J, Belfiore P. EVIDAAC Comparative Single-Subject Experimental Design Scale (CSSEDARS). Published 2009. Accessed November 20, 2016 from http://www.evidaac.com
77. Logan LR, Hickman RR, Harris SR, Heriza CB. Single-subject research design: recommendations for levels of evidence and quality rating. Dev Med Child Neurol. 2008;50:99–103.
78. Rohatgi A. WebPlotDigitizer User Manual Version 3.4. Published 2014. Accessed from http://arohatgi.info/WebPlotDigitizer/userManual.pdf
Keywords: n-of-1 studies; quality assessment; research design; single-subject research
© 2017 Academy of Neurologic Physical Therapy, APTA