Single-Case Design, Analysis, and Quality Assessment for Intervention Research

Lobo, Michele A. PT, PhD; Moeyaert, Mariola PhD; Baraldi Cunha, Andrea PT, PhD; Babik, Iryna PhD

Journal of Neurologic Physical Therapy: July 2017 - Volume 41 - Issue 3 - p 187–197
doi: 10.1097/NPT.0000000000000187
Special Interest Articles

Background and Purpose: The purpose of this article is to describe single-case studies and contrast them with case studies and randomized clinical trials. We highlight current research designs, analysis techniques, and quality appraisal tools relevant for single-case rehabilitation research.

Summary of Key Points: Single-case studies can provide a viable alternative to large group studies such as randomized clinical trials. Single-case studies involve repeated measures and manipulation of an independent variable. They can be designed to have strong internal validity for assessing causal relationships between interventions and outcomes, as well as external validity for generalizability of results, particularly when the study designs incorporate replication, randomization, and multiple participants. Single-case studies should not be confused with case studies/series (ie, case reports), which are reports of clinical management of a patient or a small series of patients.

Recommendations for Clinical Practice: When rigorously designed, single-case studies can be particularly useful experimental designs in a variety of situations, such as when research resources are limited, studied conditions have low incidences, or when examining effects of novel or expensive interventions. Readers will be directed to examples from the published literature in which these techniques have been discussed, evaluated for quality, and implemented.

Biomechanics & Movement Science Program, Department of Physical Therapy, University of Delaware, Newark, Delaware (M.A.L., A.B.C., I.B.); and Division of Educational Psychology & Methodology, State University of New York at Albany, Albany, New York (M.M.).

Correspondence: Michele A. Lobo, PT, PhD, Biomechanics & Movement Science Program, Department of Physical Therapy, University of Delaware, Newark, DE 19713 (

This research was supported by the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health & Human Development (1R21HD076092-01A1, Lobo PI), and the Delaware Economic Development Office (Grant #109).

Some of the information in this article was presented at the IV Step Meeting in Columbus, Ohio, June 2016.

The authors declare no conflict of interest.

In this special interest article we present current tools and techniques relevant for single-case rehabilitation research. Single-case (SC) studies have been identified by a variety of names, including “n of 1 studies” and “single-subject” studies. The term “single-case study” is preferred over the previously mentioned terms because previous terms suggest these studies include only 1 participant. In fact, as discussed later, for purposes of replication and improved generalizability, the strongest SC studies commonly include more than 1 participant.

An SC study should not be confused with a “case study/series” (also called “case report”). In a typical case study/series, a single patient or small series of patients is involved, but there is no purposeful manipulation of an independent variable, nor are there necessarily repeated measures. Most case studies/series are reported in a narrative way, whereas results of SC studies are presented numerically or graphically.1,2 This article defines SC studies, contrasts them with randomized clinical trials, discusses how they can be used to scientifically test hypotheses, and highlights current research designs, analysis techniques, and quality appraisal tools that may be useful for rehabilitation researchers.

In SC studies, measurements of outcome (dependent variables) are recorded repeatedly for individual participants across time and varying levels of an intervention (independent variables).1–5 These varying levels of intervention are referred to as “phases,” with 1 phase serving as a baseline or comparison, so each participant serves as his/her own control.2 In contrast to case studies and case series in which participants are observed across time without experimental manipulation of the independent variable, SC studies employ systematic manipulation of the independent variable to allow for hypothesis testing.1,6 As a result, SC studies allow for rigorous experimental evaluation of intervention effects and provide a strong basis for establishing causal inferences. Advances in design and analysis techniques for SC studies observed in recent decades have made SC studies increasingly popular in educational and psychological research. Yet, the authors believe SC studies have been undervalued in rehabilitation research, where randomized clinical trials (RCTs) are typically recommended as the optimal research design to answer questions related to interventions.7 In reality, there are advantages and disadvantages to both SC studies and RCTs that should be carefully considered to select the best design to answer individual research questions. Although there are a variety of other research designs that could be utilized in rehabilitation research, only SC studies and RCTs are discussed here because SC studies are the focus of this article and RCTs are the most highly recommended design for intervention studies.7

When designed and conducted properly, RCTs offer strong evidence that changes in outcomes may be related to provision of an intervention. However, RCTs require monetary, time, and personnel resources that many researchers, especially those in clinical settings, may not have available.8 RCTs also require access to large numbers of consenting participants who meet strict inclusion and exclusion criteria that can limit variability of the sample and generalizability of results.9 The requirement for large participant numbers may make RCTs difficult to perform in many settings, such as rural and suburban settings, and for many populations, such as those with diagnoses marked by lower prevalence.8 Relying exclusively on RCTs can produce bodies of research that are skewed toward the needs of some individuals while neglecting the needs of others. RCTs aim to include a large number of participants and to use random group assignment to create study groups that are similar to one another in terms of all potential confounding variables, but it is challenging to identify all confounding variables. Finally, the results of RCTs are typically presented in terms of group means and standard deviations that may not represent true performance of any one participant.10 This can present a challenge for clinicians aiming to translate and implement these group findings at the level of the individual.

SC studies can provide a scientifically rigorous alternative to RCTs for experimentally determining the effectiveness of interventions.1,2 SC studies can assess a variety of research questions, settings, cases, independent variables, and outcomes.11 There are many benefits to SC studies that make them appealing for intervention research. SC studies may require fewer resources than RCTs and can be performed in settings and with populations that do not allow for large numbers of participants.1,2 In SC studies, each participant serves as his/her own comparison, thus controlling for many confounding variables that can impact outcome in rehabilitation research, such as gender, age, socioeconomic level, cognition, home environment, and concurrent interventions.2,11 Results can be analyzed and presented to determine whether interventions resulted in changes at the level of the individual, the level at which rehabilitation professionals intervene.2,12 When properly designed and executed, SC studies can demonstrate strong internal validity to determine the likelihood of a causal relationship between the intervention and outcomes and external validity to generalize the findings to broader settings and populations.2,12,13

There are a variety of SC designs that can be used to study the effectiveness of interventions. Here we discuss (1) AB designs, (2) reversal designs, (3) multiple baseline designs, and (4) alternating treatment designs, as well as ways replication and randomization techniques can be used to improve internal validity of all of these designs.1–3,12–14

The simplest of these designs is the AB design15 (Figure 1). This design involves repeated measurement of outcome variables throughout a baseline control/comparison phase (A) and then throughout an intervention phase (B). When possible, it is recommended that a stable level and/or rate of change in performance be observed within the baseline phase before transitioning into the intervention phase.2 As with all SC designs, it is also recommended that there be a minimum of 5 data points in each phase.1,2 There is no randomization or replication of the baseline or intervention phases in the basic AB design.2 Therefore, AB designs have problems with internal validity and generalizability of results.12 They are weak in establishing causality because changes in outcome variables could be related to a variety of other factors, including maturation, experience, learning, and practice effects.2,12 Sample data from a single-case AB study performed to assess the impact of Floor Play intervention on social interaction and communication skills for a child with autism15 are shown in Figure 1.

Figure 1

If an intervention does not have carryover effects, it is recommended to use a reversal design.2 For example, a reversal A1BA2 design16 (Figure 2) includes alternation of the baseline and intervention phases, whereas a reversal A1B1A2B2 design17 (Figure 3) consists of alternation of 2 baseline (A1, A2) and 2 intervention (B1, B2) phases. Incorporating at least 4 phases in the reversal design (ie, A1B1A2B2 or A1B1A2B2A3B3...) allows for a stronger determination of a causal relationship between the intervention and outcome variables because the relationship can be demonstrated across at least 3 different points in time: change in outcome from A1 to B1, from B1 to A2, and from A2 to B2.18 Before using this design, however, researchers must determine that it is safe and ethical to withdraw the intervention, especially in cases where the intervention is effective and necessary.12

Figure 2

Figure 3

A recent study used an ABA reversal SC study to determine the effectiveness of core stability training in 8 participants with multiple sclerosis.16 During the first 4 weekly data collections, the researchers ensured a stable baseline, which was followed by 8 weekly intervention data points, and concluded with 4 weekly withdrawal data points. Intervention significantly improved participants' walking and reaching performance (Figure 2).16 This A1BA2 design could have been strengthened by the addition of a second intervention phase for replication (A1B1A2B2). For instance, a single-case A1B1A2B2 withdrawal design aimed to assess the efficacy of rehabilitation using visuo-spatio-motor cueing for 2 participants with severe unilateral neglect after a severe right hemisphere stroke.17 Each phase included 8 data points. Statistically significant intervention-related improvement was observed, suggesting that visuo-spatio-motor cueing might be promising for treating individuals with very severe neglect (Figure 3).17

The reversal design can also incorporate a cross-over design where each participant experiences more than 1 type of intervention. For instance, a B1C1B2C2 design could be used to study the effects of 2 different interventions (B and C) on outcome measures. Challenges with including more than 1 intervention involve potential carryover effects from earlier interventions and order effects that may impact the measured effectiveness of the interventions.2,12 Including multiple participants and randomizing the order of intervention phase presentations are tools to help control for these types of effects.19

When an intervention permanently changes an individual's ability, a return-to-baseline performance is not feasible and reversal designs are not appropriate. Multiple baseline designs (MBDs) are useful in these situations (Figure 4).20 Multiple baseline designs feature staggered introduction of the intervention across time: each participant is randomly assigned to 1 of at least 3 experimental conditions characterized by the length of the baseline phase.21 These studies involve more than 1 participant, thus functioning as SC studies with replication across participants. Staggered introduction of the intervention allows for separation of intervention effects from those of maturation, experience, learning, and practice. For example, a multiple baseline SC study was used to investigate the effect of an antispasticity baclofen medication on stiffness in 5 adult males with spinal cord injury.20 The subjects were randomly assigned to receive 5 to 9 baseline data points with a placebo treatment before the initiation of the intervention phase with the medication. Both participants and assessors were blind to the experimental condition. The results suggested that baclofen might not be a universal treatment choice for all individuals with spasticity resulting from a traumatic spinal cord injury (Figure 4).20

Figure 4

The impact of 2 or more interventions can also be assessed via alternating treatment designs (ATDs). In ATDs, after establishing the baseline, the experimenter exposes subjects to different intervention conditions administered in close proximity for equal intervals (Figure 5).22 ATDs are prone to “carryover effects” when the effects of 1 intervention influence the observed outcomes of another intervention.1 As a result, such designs introduce unique challenges when attempting to determine the effects of any 1 intervention and have been less commonly utilized in rehabilitation. An ATD was used to monitor disruptive behaviors in the school setting throughout a baseline followed by an alternating treatment phase with randomized presentation of a control condition or an exercise condition.23 Results showed that 30 minutes of moderate to intense physical activity decreased behavioral disruptions through 90 minutes after the intervention.23 An ATD was also used to compare the effects of commercially available and custom-made video prompts on the performance of multistep cooking tasks in 4 participants with autism.22 Results showed that participants independently performed more steps with the custom-made video prompts (Figure 5).22

Figure 5

Regardless of the SC study design, replication and randomization should be incorporated when possible to improve internal and external validity.11 The reversal design is an example of replication across study phases. The minimum number of phase replications needed to meet quality standards is 3 (A1B1A2B2), but having 4 or more replications is highly recommended (A1B1A2B2A3...).11,14 In cases when interventions aim to produce lasting changes in participants' abilities, replication of findings may be demonstrated by replicating intervention effects across multiple participants (as in multiple-participant AB designs), or across multiple settings, tasks, or service providers. When the results of an intervention are replicated across multiple reversals, participants, and/or contexts, there is an increased likelihood that a causal relationship exists between the intervention and the outcome.2,12

Randomization should be incorporated in SC studies to improve internal validity and the ability to assess for causal relationships among interventions and outcomes.11 In contrast to traditional group designs, SC studies often do not have multiple participants or units that can be randomly assigned to different intervention conditions. Instead, in randomized phase-order designs, the sequence of phases is randomized. Simple or block randomization is possible. For example, with simple randomization for an A1B1A2B2 design, the A and B conditions are treated as separate units and are randomly assigned to be administered for each of the predefined data collection points. As a result, any combination of A-B sequences is possible without restrictions on the number of times each condition is administered or regard for repetitions of conditions (eg, A1B1B2A2B3B4B5A3B6A4A5A6). With block randomization for an A1B1A2B2 design, 2 conditions (eg, A and B) would be blocked into a single unit (AB or BA), randomization of which to different periods would ensure that each condition appears in the resulting sequence more than 2 times (eg, A1B1B2A2A3B3A4B4). Note that AB and reversal designs require that the baseline (A) always precedes the first intervention (B), which should be accounted for in the randomization scheme.2,11
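The two phase-order schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not a published procedure: the function names and the `baseline_first` option are assumptions introduced here, and forcing the first occasion (or first block) to be baseline is only one possible way to respect the requirement that A precede the first B.

```python
import random

def simple_randomization(n_points, rng, baseline_first=True):
    """Assign condition A or B independently to each data collection point."""
    sequence = [rng.choice("AB") for _ in range(n_points)]
    if baseline_first and sequence:
        sequence[0] = "A"  # assumption: force the series to open with baseline
    return sequence

def block_randomization(n_blocks, rng, baseline_first=True):
    """Randomize the order within each AB block (AB or BA)."""
    sequence = []
    for block_index in range(n_blocks):
        block = ["A", "B"]
        # Assumption: keep the first block as AB so baseline comes first.
        if not (baseline_first and block_index == 0):
            rng.shuffle(block)
        sequence.extend(block)
    return sequence

rng = random.Random(2017)  # fixed seed only so the sketch is reproducible
print("simple:", "".join(simple_randomization(12, rng)))
print("block: ", "".join(block_randomization(4, rng)))
```

With simple randomization the number of A and B occasions is left to chance, whereas block randomization guarantees that each condition appears exactly once per block.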

In randomized phase start-point designs, the lengths of the A and B phases can be randomized.2,11,24–26 For example, for an AB design, researchers could specify the number of time points at which outcome data will be collected (eg, 20), define the minimum number of data points desired in each phase (eg, 4 for A, 3 for B), and then randomize the initiation of the intervention so that it occurs anywhere between the remaining time points (points 5 and 17 in the current example).27,28 For multiple baseline designs, a dual-randomization or “regulated randomization” procedure has been recommended.29 If multiple baseline randomization depends solely on chance, it could be the case that all units are assigned to begin intervention at points not really separated in time.30 Such randomly selected initiation of the intervention would result in the drastic reduction of the discriminant and internal validity of the study.29 To eliminate this issue, investigators should first specify appropriate intervals between the start points for different units, then randomly select from those intervals, and finally randomly assign each unit to a start point.29
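The three steps of the regulated randomization procedure can be illustrated as follows. This is a hedged sketch: the function name, the participant labels, and the example intervals are hypothetical, and the intervals are assumed to be non-overlapping ranges of measurement occasions chosen by the investigator in advance.

```python
import random

def regulated_start_points(intervals, units, rng):
    """Dual randomization for a multiple baseline design.

    intervals: list of (first, last) inclusive ranges of candidate
               intervention start points, one range per unit, specified
               in advance so that start points are separated in time.
    units:     participant (or setting) labels.
    """
    if len(intervals) != len(units):
        raise ValueError("need exactly one interval per unit")
    # Step 2: randomly select one start point from each interval.
    starts = [rng.randint(first, last) for first, last in intervals]
    # Step 3: randomly assign each unit to one of the start points.
    shuffled_units = list(units)
    rng.shuffle(shuffled_units)
    return dict(zip(shuffled_units, starts))

rng = random.Random(29)  # fixed seed only so the sketch is reproducible
# Step 1 (hypothetical values): three well-separated intervals.
plan = regulated_start_points([(5, 6), (9, 10), (13, 14)], ["P1", "P2", "P3"], rng)
print(plan)
```

Because each start point is drawn from its own interval, the staggering of the intervention is guaranteed by design rather than left to chance.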

The What Works Clearinghouse (WWC) single-case design technical documentation provides an excellent overview of appropriate SC study analysis techniques to evaluate intervention effects.1,18 First, visual analyses are recommended to determine whether there is a functional relationship between the intervention and the outcome. Second, if evidence for a functional effect is present, the visual analysis is supplemented with quantitative analysis methods evaluating the magnitude of the intervention effect. Third, effect sizes are combined across cases to estimate overall average intervention effects, which contribute to evidence-based practice, theory, and future applications.2,18

Visual Analysis

Traditionally, SC study data are presented graphically. When more than 1 participant engages in a study, a spaghetti plot showing all of their data in the same figure can be helpful for visualization. Visual analysis of graphed data has been the traditional method for evaluating treatment effects in SC research.1,12,31,32 The visual analysis involves evaluating level, trend, and stability of the data within each phase (ie, within-phase data examination) followed by examination of the immediacy of effect, consistency of data patterns, and overlap of data between baseline and intervention phases (ie, between-phase comparisons). When the changes (and/or variability) in level are in the desired direction, are immediate, readily discernible, and maintained over time, it is concluded that the changes in behavior across phases result from the implemented treatment and are indicative of improvement.33 Three demonstrations of an intervention effect are necessary for establishing a functional relationship.1

Within-Phase Examination

Level, trend, and stability of the data within each phase are evaluated. Mean and/or median can be used to report the level, and trend can be evaluated by determining whether the data points are monotonically increasing or decreasing. Within-phase stability can be evaluated by calculating the percentage of data points within 15% of the phase median (or mean). The stability criterion is satisfied if about 85% (80%–90%) of the data in a phase fall within a 15% range of the median (or average) of all data points for that phase.34
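The stability criterion can be checked with a short calculation. The sketch below assumes that "within 15% of the phase median" means within the median plus or minus 15% of the median; other readings of the envelope width are possible, and the function name and data are illustrative only.

```python
from statistics import median

def stability_percentage(phase_data, band=0.15):
    """Percent of data points within +/- band * median of the phase median."""
    center = median(phase_data)
    half_width = band * abs(center)  # assumption: symmetric envelope
    inside = sum(1 for y in phase_data if abs(y - center) <= half_width)
    return 100.0 * inside / len(phase_data)

baseline = [10, 11, 10, 12, 10, 9, 10, 11]  # hypothetical baseline scores
print(stability_percentage(baseline))  # 87.5
```

Here 7 of the 8 hypothetical points fall inside the envelope, so this phase would satisfy the criterion of about 85% (80%–90%).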

Between-Phase Examination

Immediacy of effect, consistency of data patterns, and overlap of data between baseline and intervention phases are evaluated next. For this, several nonoverlap indices have been proposed that all quantify the proportion of measurements in the intervention phase not overlapping with the baseline measurements.35 Nonoverlap statistics are typically scaled as percent from 0 to 100, or as a proportion from 0 to 1. Here, we briefly discuss the nonoverlap of all pairs (NAP),36 the extended celeration line (ECL), the improvement rate difference (IRD),37 and the TauU and its trend-adjusted version, TauUadj,35 as these are the most recent and complete techniques. We also examine the percentage of nonoverlapping data (PND)38 and the 2-standard deviation band method, as these are frequently used techniques. In addition, we include the percentage of nonoverlapping corrected data (PNCD), an index that applies the PND after controlling for baseline trend.39

Nonoverlap of All Pairs

Each baseline observation can be paired with each intervention phase observation to make n pairs (ie, n = nA × nB). Count the number of overlapping pairs, n0, counting all ties as 0.5. Then define NAP as the proportion of the pairs that show no overlap. Equivalently, one can count the number of positive (P), negative (N), and tied (T) pairs2,36:

$$\mathrm{NAP} = \frac{n - n_0}{n} = \frac{P + 0.5\,T}{P + N + T}$$
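As an illustration, NAP can be computed by enumerating all baseline-intervention pairs. The sketch uses the standard definition NAP = (P + 0.5 × T)/(P + N + T); the function name, the `increase_expected` flag, and the data are hypothetical.

```python
def nap(baseline, intervention, increase_expected=True):
    """Nonoverlap of all pairs: (P + 0.5 * T) / (P + N + T)."""
    sign = 1 if increase_expected else -1
    positive = negative = tied = 0
    for a in baseline:
        for b in intervention:
            diff = sign * (b - a)
            if diff > 0:
                positive += 1   # pair shows improvement (no overlap)
            elif diff < 0:
                negative += 1   # pair shows deterioration (overlap)
            else:
                tied += 1       # tie, counted as half an overlap
    return (positive + 0.5 * tied) / (positive + negative + tied)

baseline = [2, 3, 3, 4, 3]       # hypothetical phase A scores
intervention = [4, 5, 6, 5, 7]   # hypothetical phase B scores
print(nap(baseline, intervention))
```

For these data, 24 of the 25 pairs are positive and 1 is tied, giving NAP = 24.5/25 = 0.98.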

Extended Celeration Line

ECL, or the split-middle line, allows control for a positive phase A trend. Nonoverlap is defined as the proportion of phase B data (nB) that are above the median trend line plotted from phase A data and then extended into phase B (nB, above median trend A):

$$\mathrm{ECL} = \frac{n_{B,\ \text{above median trend A}}}{n_B}$$

As a consequence, this method depends on a straight line and makes an assumption of linearity in the baseline.2,12

Improvement Rate Difference

This analysis is conceptualized as the difference in improvement rates (IR) between baseline (IRB) and intervention phases (IRT).38 The IR for each phase is defined as the number of “improved data points” divided by the total data points in that phase. Improvement rate difference, commonly employed in medical group research under the name of “risk reduction” or “risk difference,” attempts to provide an intuitive interpretation for nonoverlap and to make use of an established, respected effect size, IRT − IRB, or the difference between 2 proportions.37

TauU and TauUadj

Each baseline observation can be paired with each intervention phase observation to make n pairs (ie, n = nA × nB). Count the number of positive (P), negative (N), and tied (T) pairs, and use the following formula:

$$\mathrm{TauU} = \frac{P - N}{P + N + T}$$

The TauUadj is an adjustment of TauU for monotonic trend in baseline. In addition to the nA × nB between-phase pairs, each baseline observation is paired with all later baseline observations (nA × (nA − 1)/2 pairs), and the positive (PA) and negative (NA) pairs within the baseline are counted.2,35 The baseline trend, Strend, can then be computed and removed:

$$\mathrm{TauU}_{adj} = \frac{P - N - S_{trend}}{P + N + T}; \quad S_{trend} = P_A - N_A$$
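The pair counting behind TauU and TauUadj can be sketched as follows, taking TauU = (P − N)/(P + N + T) and TauUadj = (P − N − Strend)/(P + N + T) with Strend = PA − NA. The function names and data are hypothetical, and the sketch assumes that an increase in the outcome is the desired direction.

```python
def between_phase_pairs(baseline, intervention):
    """Count positive, negative, and tied baseline-intervention pairs."""
    pos = sum(1 for a in baseline for b in intervention if b > a)
    neg = sum(1 for a in baseline for b in intervention if b < a)
    tied = len(baseline) * len(intervention) - pos - neg
    return pos, neg, tied

def baseline_trend(baseline):
    """S_trend = P_A - N_A over all pairs of earlier and later baseline points."""
    pos = neg = 0
    for i in range(len(baseline)):
        for j in range(i + 1, len(baseline)):
            if baseline[j] > baseline[i]:
                pos += 1
            elif baseline[j] < baseline[i]:
                neg += 1
    return pos - neg

def tau_u(baseline, intervention):
    p, n, t = between_phase_pairs(baseline, intervention)
    return (p - n) / (p + n + t)

def tau_u_adj(baseline, intervention):
    p, n, t = between_phase_pairs(baseline, intervention)
    return (p - n - baseline_trend(baseline)) / (p + n + t)

baseline = [2, 3, 3, 4, 3]       # hypothetical phase A scores
intervention = [4, 5, 6, 5, 7]   # hypothetical phase B scores
print(tau_u(baseline, intervention), tau_u_adj(baseline, intervention))
```

For these data P = 24, N = 0, and T = 1, so TauU = 0.96; the baseline contains 6 positive and 1 negative pair (Strend = 5), so TauUadj = 19/25 = 0.76.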

Online calculators might assist researchers in obtaining the TauU and TauU adjusted coefficients (

Percentage of Nonoverlapping Data

If anticipating an increase in the outcome, locate the highest data point in the baseline phase and then calculate the percent of the intervention phase data points that exceed it. If anticipating a decrease in the outcome, find the lowest data point in the baseline phase and then calculate the percent of the treatment phase data points that are below it:

$$\mathrm{PND} = \frac{n_{B,\ \text{nonoverlapping}}}{n_B} \times 100$$

A PND less than 50 would mark no observed effect, PND = 50 to 70 signifies a questionable effect, and PND more than 70 suggests the intervention was effective.40 The percentage of nonoverlapping corrected data (PNCD) was proposed in 2009 as an extension of the PND.39 Before applying the PND, a data correction procedure is applied, eliminating preexisting baseline trend.38
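The PND calculation and the interpretive cutoffs given above can be expressed directly in code. The function names and data below are hypothetical, and the sketch covers only the uncorrected PND, not the baseline-trend correction used for the PNCD.

```python
def pnd(baseline, intervention, increase_expected=True):
    """Percent of intervention points beyond the most extreme baseline point."""
    if increase_expected:
        threshold = max(baseline)
        nonoverlapping = sum(1 for y in intervention if y > threshold)
    else:
        threshold = min(baseline)
        nonoverlapping = sum(1 for y in intervention if y < threshold)
    return 100.0 * nonoverlapping / len(intervention)

def interpret_pnd(value):
    """Cutoffs as stated in the text: <50, 50 to 70, >70."""
    if value < 50:
        return "no observed effect"
    if value <= 70:
        return "questionable effect"
    return "effective"

baseline = [2, 3, 3, 4, 3]       # hypothetical phase A scores
intervention = [4, 5, 6, 5, 7]   # hypothetical phase B scores
score = pnd(baseline, intervention)
print(score, interpret_pnd(score))  # 80.0 effective
```

Note that a single extreme baseline point determines the threshold, which is one reason the PND is usually reported alongside other nonoverlap indices.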

Two Standard Deviation Band Method

When the stability criterion described earlier is met within phases, it is possible to apply the 2-standard deviation band method.12,41 First, the mean of the data for a specific condition is calculated and represented with a solid line. In the next step, the standard deviation of the same data is computed, and 2 dashed lines are represented: one located 2 standard deviations above the mean and the other 2 standard deviations below. For normally distributed data, few points (<5%) are expected to be outside the 2-standard deviation bands if there is no change in the outcome score because of the intervention. However, this method is not considered a formal statistical procedure, as the data cannot typically be assumed to be normal, continuous, or independent.41
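A minimal sketch of the 2-standard deviation band method follows. It assumes the band is built from the baseline phase and that the sample standard deviation (n − 1 denominator) is used; the text does not specify either choice, and the function names and data are hypothetical.

```python
from statistics import mean, stdev

def two_sd_band(reference_phase):
    """Return (lower, upper) limits: mean -/+ 2 standard deviations."""
    center = mean(reference_phase)
    spread = stdev(reference_phase)  # assumption: sample SD (n - 1)
    return center - 2 * spread, center + 2 * spread

def points_outside_band(reference_phase, comparison_phase):
    """List comparison-phase points falling outside the reference band."""
    lower, upper = two_sd_band(reference_phase)
    return [y for y in comparison_phase if y < lower or y > upper]

baseline = [10, 12, 11, 9, 13]       # hypothetical phase A scores
intervention = [14, 15, 16, 13, 17]  # hypothetical phase B scores
print(two_sd_band(baseline))
print(points_outside_band(baseline, intervention))  # [15, 16, 17]
```

In this hypothetical example, 3 of the 5 intervention points fall above the upper band, far more than the fewer than 5% expected if the intervention had no effect on normally distributed data.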

Statistical Analysis

If the visual analysis indicates a functional relationship (ie, 3 demonstrations of an intervention effect), it is recommended to proceed with the quantitative analyses, reflecting the magnitude of the intervention effect. First, effect sizes are calculated for each participant (individual-level analysis). Second, if the research interest lies in the generalizability of the effect size across participants, effect sizes can be combined across cases to achieve an overall average effect size estimate (across-case effect size).

Note that quantitative analysis methods are still being developed in the domain of SC research1 and statistical challenges of producing an acceptable measure of treatment effect remain.14,42,43 Therefore, the WWC standards strongly recommend conducting sensitivity analysis and reporting multiple effect size estimators. If consistency across different effect size estimators is identified, there is stronger evidence for the effectiveness of the treatment.1,18

Individual-Level Effect Size Analysis

The most common effect sizes recommended for SC analysis are (1) standardized mean difference Cohen's d; (2) standardized mean difference with correction for small sample sizes Hedges' g; and (3) the regression-based approach, which has the most potential and is strongly recommended by the WWC standards.1,44,45 Cohen's d can be calculated using the following formula:

$$d = \frac{\bar{X}_B - \bar{X}_A}{s_p}$$

with $\bar{X}_A$ being the baseline mean, $\bar{X}_B$ being the treatment mean, and sp indicating the pooled within-case standard deviation. Hedges' g is an extension of Cohen's d, recommended in the context of SC studies, as it corrects for small sample sizes. The piecewise regression-based approach reflects not only the immediate intervention effect but also the intervention effect across time:

$$Y_i = \beta_0 + \beta_1 T_i + \beta_2 D_i + \beta_3 T_i D_i + e_i, \quad e_i = \rho\, e_{i-1} + u_i$$

Here i stands for the measurement occasion (i = 0, 1, ..., I), ρ indicates the autocorrelation parameter, and ui is an independent error term. If ρ is positive, the errors closer in time are more similar; if ρ is negative, the errors closer in time are more different; and if ρ equals 0, there is no correlation between the errors. The dependent variable is regressed on a time indicator, T, which is centered around the first observation of the intervention phase; D, a dummy variable for the intervention phase; and an interaction term of these variables. The equation shows that the expected score, $\hat{Y}_i$, equals β0 + β1Ti in the baseline phase, and (β0 + β2) + (β1 + β3)Ti in the intervention phase. β0, therefore, indicates the expected baseline level at the start of the intervention phase (when T = 0), whereas β1 marks the linear time trend in the baseline scores. The coefficient β2 can then be interpreted as an immediate effect of the intervention on the outcome, whereas β3 signifies the effect of the intervention across time. The ei's are residuals assumed to be normally distributed around a mean of 0 with a variance of σ²e. The assumption of independence of errors is usually not met in the context of SC studies because repeated measures are obtained within a person. As a consequence, the residuals may be autocorrelated, meaning that errors closer in time are more related to each other compared with errors further away in time.46–48 A lag-1 autocorrelation structure is therefore appropriate (taking into account the correlation between 2 consecutive errors, ei and ei−1; for more details see Verbeke and Molenberghs49).
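The individual-level effect sizes can be illustrated with a short standard-library sketch. Several assumptions are made here that the text does not fix: d is taken as treatment mean minus baseline mean divided by the standard deviation pooled over the 2 phases, Hedges' g uses the common approximate correction factor 1 − 3/(4df − 1), and the piecewise regression is fit by ordinary least squares, which ignores the autocorrelation discussed above. Because the model contains a full phase-by-time interaction, the ordinary least squares solution equals 2 separate straight-line fits, one per phase, which is the shortcut used below. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(baseline, treatment):
    """(treatment mean - baseline mean) / pooled SD across the two phases."""
    n_a, n_b = len(baseline), len(treatment)
    pooled_var = ((n_a - 1) * variance(baseline)
                  + (n_b - 1) * variance(treatment)) / (n_a + n_b - 2)
    return (mean(treatment) - mean(baseline)) / sqrt(pooled_var)

def hedges_g(baseline, treatment):
    """Cohen's d times an approximate small-sample correction factor."""
    df = len(baseline) + len(treatment) - 2
    return cohens_d(baseline, treatment) * (1 - 3 / (4 * df - 1))

def linear_fit(x, y):
    """Ordinary least squares intercept and slope for one phase."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

def piecewise_coefficients(baseline, treatment):
    """Return (b0, b1, b2, b3) with T = 0 at the first intervention point."""
    t_a = list(range(-len(baseline), 0))   # baseline occasions: ..., -2, -1
    t_b = list(range(len(treatment)))      # intervention occasions: 0, 1, ...
    b0, b1 = linear_fit(t_a, baseline)
    level_b, slope_b = linear_fit(t_b, treatment)
    return b0, b1, level_b - b0, slope_b - b1

baseline = [2, 3, 4, 5]        # hypothetical phase A scores
treatment = [10, 12, 14, 16]   # hypothetical phase B scores
print(piecewise_coefficients(baseline, treatment))
```

For these hypothetical data the baseline projects to a level of 6 at T = 0 with a slope of 1, so the estimated immediate effect (β2) is 4 and the change in slope (β3) is 1. Standard errors and autocorrelation adjustments require dedicated statistical software and are outside the scope of this sketch.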

Across-Case Effect Sizes

Two-level modeling to estimate the intervention effects across cases can be used to evaluate across-case effect sizes.44,45,50 Multilevel modeling is recommended by the WWC standards because it takes the hierarchical nature of SC studies into account: measurements are nested within cases and cases, in turn, are nested within studies. By conducting a multilevel analysis, important research questions can be addressed (which cannot be answered by single-level analysis of SC study data), such as (1) What is the magnitude of the average treatment effect across cases? (2) What is the magnitude and direction of the case-specific intervention effect? (3) How much does the treatment effect vary within cases and across cases? (4) Does a case and/or study-level predictor influence the treatment's effect? The 2-level model has been validated in previous research using extensive simulation studies.45,46,51 The 2-level model appears to have sufficient power (>0.80) to detect large treatment effects in at least 6 participants with 6 measurements.21
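One common way to write such a 2-level model is sketched below. It extends the single-case piecewise regression with a case index j; the symbols θ and u are notation assumed here for illustration and are not taken from the cited sources.

```latex
% Level 1: measurement occasion i within case j
Y_{ij} = \beta_{0j} + \beta_{1j} T_{ij} + \beta_{2j} D_{ij}
       + \beta_{3j} T_{ij} D_{ij} + e_{ij}

% Level 2: case-specific coefficients vary around across-case averages
\beta_{kj} = \theta_{k} + u_{kj}, \qquad k = 0, 1, 2, 3
```

Under this formulation, θ2 and θ3 would address question (1) as the average immediate effect and the average effect on the time trend, the case-specific β2j and β3j would address question (2), and the variances of e and u would address question (3).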

Furthermore, to estimate the across-case effect sizes, the HPS (Hedges, Pustejovsky, and Shadish), or single-case educational design (SCEdD)-specific mean difference, index can be calculated.52 This is a standardized mean difference index specifically designed for SCEdD data, with the aim of making it comparable to Cohen's d of group-comparison designs. The standard deviation takes into account both within-participant and between-participant variability, and is typically used to get an across-case estimator for a standardized change in level. The advantage of using the HPS across-case effect size estimator is that it is directly comparable with Cohen's d for group comparison research, thus enabling the use of Cohen's (1988) benchmarks.53

Valuable recommendations on SC data analyses have recently been provided.54,55 They suggest that a specific SC study data analytic technique can be chosen on the basis of (1) the study aims and the desired quantification (eg, overall quantification, between-phase quantifications, and randomization), (2) the data characteristics as assessed by visual inspection and the assumptions one is willing to make about the data, and (3) the knowledge and computational resources.54,55 Table 1 lists recommended readings and some commonly used resources related to the design and analysis of single-case studies.

Table 1

Quality appraisal tools are important to guide researchers in designing strong experiments and conducting high-quality systematic reviews of the literature. Unfortunately, quality assessment tools for SC studies are relatively novel, ratings across tools demonstrate variability, and there is currently no “gold standard” tool.56 Table 2 lists important SC study quality appraisal criteria compiled from the most common scales; when planning studies or reviewing the literature, we recommend that readers consider these criteria. Table 3 lists some commonly used SC quality assessment and reporting tools and references to resources where the tools can be located.

Table 2

Table 3

When an established tool is required for systematic review, we recommend use of the WWC tool because it has well-defined criteria and is developed and supported by leading experts in the SC research field in association with the Institute of Education Sciences.18 The WWC documentation provides clear standards and procedures to evaluate the quality of SC research; it assesses the internal validity of SC studies, classifying them as “meeting standards,” “meeting standards with reservations,” or “not meeting standards.”1,18 Only studies classified in the first 2 categories are recommended for further visual analysis. Also, WWC evaluates the evidence of effect, classifying studies into “strong evidence of a causal relation,” “moderate evidence of a causal relation,” or “no evidence of a causal relation.” Effect size should only be calculated for studies providing strong or moderate evidence of a causal relation.
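
The review sequence described above can be sketched as a small decision function. This is a minimal illustration of the order of steps in the paragraph, not WWC software: the rating labels are quoted from the text, while the function name, the step names, and the error handling are hypothetical.

```python
# Rating labels quoted from the text above.
DESIGN_PASS = {"meeting standards", "meeting standards with reservations"}
DESIGN_FAIL = {"not meeting standards"}
EVIDENCE_PASS = {
    "strong evidence of a causal relation",
    "moderate evidence of a causal relation",
}
EVIDENCE_FAIL = {"no evidence of a causal relation"}


def review_steps(design_rating, evidence_rating=None):
    """Return the ordered review steps a study qualifies for (hypothetical helper)."""
    if design_rating in DESIGN_FAIL:
        # Studies that do not meet design standards are not analyzed further.
        return []
    if design_rating not in DESIGN_PASS:
        raise ValueError(f"unknown design rating: {design_rating!r}")

    # Studies in the first 2 design categories proceed to visual analysis.
    steps = ["visual analysis"]

    if evidence_rating is None or evidence_rating in EVIDENCE_FAIL:
        # No effect size without strong or moderate evidence of a causal relation.
        return steps
    if evidence_rating not in EVIDENCE_PASS:
        raise ValueError(f"unknown evidence rating: {evidence_rating!r}")

    steps.append("effect size calculation")
    return steps


print(review_steps("meeting standards with reservations",
                   "moderate evidence of a causal relation"))
```

A study rated "not meeting standards" yields an empty list of steps, whereas a study that meets design standards but shows no evidence of a causal relation stops after visual analysis.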

The Single-Case Reporting Guideline In BEhavioural Interventions (SCRIBE) 2016 is another useful SC research tool, developed recently to improve the quality of reporting for single-case designs.57 SCRIBE consists of a 26-item checklist that researchers should address when reporting the results of SC studies. This practical checklist allows for critical evaluation of SC studies during study planning, manuscript preparation, and review.

Single-case studies can be designed and analyzed in a rigorous manner that strengthens researchers' ability to assess causal relationships among interventions and outcomes and to generalize their results.2,12 These studies can be strengthened by incorporating replication of findings across multiple study phases, participants, settings, or contexts, and by randomizing conditions or phase lengths.11 A variety of tools allow researchers to objectively analyze findings from SC studies.56 Although a variety of quality assessment tools exist for SC studies, they can be difficult to locate and use without experience, and different tools can yield different results. The WWC quality assessment tool is recommended for those aiming to systematically review SC studies.1,18

SC studies, like all types of study designs, have a variety of limitations. First, it can be challenging to collect at least 5 data points in a given study phase. This may be especially true when traveling for data collection is difficult for participants, or during the baseline phase when delaying intervention may not be safe or ethical. Power in SC studies is related to the number of data points gathered for each participant, so it is important to avoid having a limited number of data points.12,58 Second, SC studies are not always designed in a rigorous manner and, thus, may have poor internal validity. This limitation can be overcome by addressing key characteristics that strengthen SC designs (Table 2).1,14,18 Third, SC studies may have poor generalizability. This limitation can be overcome by including a greater number of participants, or units. Fourth, SC studies may require consultation from expert methodologists and statisticians to ensure proper study design and data analysis, especially to manage issues like autocorrelation and variability of data.2 Fifth, although it is recommended to achieve a stable level and rate of performance throughout the baseline, human performance is quite variable and can make this requirement challenging. Finally, the most important validity threat to SC studies is maturation. This challenge must be considered during the design process to strengthen SC studies.1,2,12,58

SC studies can be particularly useful for rehabilitation research. They allow researchers to closely track and report change at the level of the individual. They may require fewer resources and, thus, can allow for high-quality experimental research, even in clinical settings. Furthermore, they provide a tool for assessing causal relationships in populations and settings where large numbers of participants are not accessible. For all of these reasons, SC studies can serve as an effective method for assessing the impact of interventions.

1. Kratochwill TR, Hitchcock J, Horner RH, et al. Single case designs technical documentation. What Works Clearinghouse: Procedures and Standards Handbook. Published 2010.
2. Kratochwill TR, Levin JR, eds. Single-Case Intervention Research: Methodological and Statistical Advances. Washington, DC: American Psychological Association; 2014.
3. Barlow DH, Nock MK, Hersen M. Single Case Experimental Designs: Strategies for Studying Behavior Change. 3rd ed. Boston, MA: Allyn & Bacon; 2008.
4. Kazdin AE. Single-Case Research Designs: Methods for Clinical and Applied Settings. 2nd ed. New York, NY: Oxford University Press; 2010.
5. Onghena P. Single-case designs. In: Howell BED, ed. Encyclopedia of Statistics in Behavioral Science. Vol 4. Chichester, England: Wiley; 2005:1850–1854.
6. Tate RL, McDonald S, Perdices M, Togher L, Schultz R, Savage S. Rating the methodological quality of single-subject designs and n-of-1 trials: introducing the Single-Case Experimental Design (SCED) Scale. Neuropsychol Rehabil. 2008;18(4):385–401.
7. Jewell DV. Guide to Evidence-Based PT Practice. 3rd ed. Burlington, MA: Jones & Bartlett Learning; 2014.
8. Sanson-Fisher RW, Bonevski B, Green LW, D'Este C. Limitations of the randomized controlled trial in evaluating population-based health interventions. Am J Prev Med. 2007;33(2):155–161.
9. Cartwright N, Munro E. The limitations of randomized controlled trials in predicting effectiveness. J Eval Clin Pract. 2010;16(2):260–266.
10. Lefkowitz W, Jefferson TC. Medicine at the limits of evidence: the fundamental limitation of the randomized clinical trial and the end of equipoise. J Perinatol. 2014;34(4):249–251.
11. Kratochwill TR, Levin JR. Enhancing the scientific credibility of single-case intervention research: randomization to the rescue. Psychol Methods. 2010;15(2):124–144.
12. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. Philadelphia, PA: F. A. Davis Company; 2015.
13. Kratochwill TR, Levin JR. Single-Case Research Design and Analysis. Hillsdale, NJ: Lawrence Erlbaum Associates; 1992.
14. Horner RH, Carr EG, Halle J, McGee G, Odom S, Wolery M. The use of single-subject research to identify evidence-based practice in special education. Except Children. 2005;71(2):165–179.
15. Dionne M, Martini R. Floor Time Play with a child with autism: a single subject study. Can J Occup Ther. 2011;78(3):196–203.
16. Freeman JA, Gear M, Pauli A, et al. The effect of core stability training on balance and mobility in ambulant individuals with multiple sclerosis: a multi-centre series of single case studies. Mult Scler. 2010;16(11):1377–1384.
17. Samuel C, Louis-Dreyfus A, Kaschel R, et al. Rehabilitation of very severe unilateral neglect by visuo-spatio-motor cueing: Two single case studies. Neuropsychol Rehabil. 2000;10(4):385–399.
18. What Works Clearinghouse. What Works Clearinghouse: Procedures and Standards Handbook. Version 3.0:1-91. Washington, DC: Institute of Education Sciences.
19. Wellek S, Blettner M. On the proper use of the crossover design in clinical trials: part 18 of a series on evaluation of scientific publications. Dtsch Arztebl Int. 2012;109(15):276–281.
20. Hinderer SR, Lehmann JF, Price R, White O, deLateur BJ, Deitz J. Spasticity in spinal cord injured persons: quantitative effects of baclofen and placebo treatments. Am J Phys Med Rehab. 1990;69(6):311–317.
21. Ferron JM, Moeyaert M, Van den Noortgate W, Beretvas SN. Estimating causal effects from multiple-baseline studies: implications for design and analysis. Psychol Methods. 2014;19(4):493–510.
22. Mechling LC, Ayres KM, Foster AL, Bryant KJ. Comparing the effects of commercially available and custom-made video prompting for teaching cooking skills to high school students with autism. Rem Spec Educ. 2013;34(6):371–383.
23. Folino A, Ducharme JM, Greenwald N. Temporal effects of antecedent exercise on students' disruptive behaviors: an exploratory study. J School Psychol. 2014;52(5):447–462.
24. Edgington ES, Onghena P. Randomization Tests. Boca Raton, FL: Chapman & Hall/CRC; 2007.
25. Onghena P, Edgington ES. Customization of pain treatments: single-case design and analysis. Clin J Pain. 2005;21:56–68.
26. Todman JB, Dugard P. Single-Case and Small-n Experimental Designs: A Practical Guide to Randomization Tests. Mahwah, NJ: Erlbaum; 2001.
27. Edgington ES. Randomization tests for one-subject operant experiments. J Psychol. 1975;90(1):57–68.
28. Edgington ES. Nonparametric tests for single-case experiments. Single-Case Research Design and Analysis. Hillsdale, NJ: Erlbaum; 1992.
29. Koehler MJ, Levin JR. Regulated Randomization: a potentially sharper analytical tool for the multiple-baseline design. Psychol Methods. 1998;3(2):206–217.
30. Marascuilo LA, Busk PL. Combining statistics for multiple-baseline AB and replicated ABAB designs across subjects. Behav Assess. 1988;10:1–28.
31. Ferron J, Jones PK. Tests for the visual analysis of response-guided multiple-baseline data. J Exp Educ. 2006;75(1):66–81.
32. Horner RH, Swaminathan H, Sugai G, Smolkowski K. Considerations for the systematic analysis and use of single-case research. Educ Treat Children. 2012;35(2):269–290.
33. Busse RT, Kratochwill TR, Elliott SN. Meta-analysis for single-case consultation outcomes: applications to research and practice. J School Psychol. 1995;33(4):269–285.
34. Neuman SB, McCormick S. Single-Subject Experimental Research: Applications for Literacy. Newark, DE: International Reading Association; 1995.
35. Parker RI, Vannest KJ, Davis JL. Effect size in single-case research: a review of nine nonoverlap techniques. Behav Modif. 2011;35(4):303–322.
36. Parker RI, Vannest K. An improved effect size for single-case research: nonoverlap of all pairs. Behav Ther. 2009;40(4):357–367.
37. Parker RI, Vannest KJ, Brown L. The “improvement rate difference” for single-case research. Except Child. 2009;75(2):135–150.
38. Scruggs TE, Mastropieri MA. PND at 25: past, present, and future trends in summarizing single subject research. Rem Spec Educ. 2013;34:9–19.
39. Manolov R, Solanas A. Percentage of nonoverlapping corrected data. Behav Res Meth. 2009;41:1262–1271.
40. Scruggs TE, Mastropieri MA. Summarizing single-subject research: issues and applications. Behav Modif. 1998;22:221–242.
41. Callahan CD, Barisa MT. Statistical process control and rehabilitation outcome: the single-subject design reconsidered. Rehabil Psychol. 2005;50(1):24–33.
42. Beretvas SN, Chung H. A review of meta-analyses of single-subject experimental designs: methodological issues and practice. Evid Based Commun Assess Interv. 2008;2(3):129–141.
43. Shadish WR, Rindskopf DM. Methods for evidence-based practice: quantitative synthesis of single-subject designs. New Dir Eval. 2007;113:95–109.
44. Van den Noortgate W, Onghena P. Hierarchical linear models for the quantitative integration of effect sizes in single-case research. Behav Res Meth Instrum Comput. 2003;35(1):1–10.
45. Van den Noortgate W, Onghena P. Combining single-case experimental data using hierarchical linear models. Sch Psychol Q. 2003;18(3):325–346.
46. Ferron JM, Bell BA, Hess MR, Rendina-Gobioff G, Hibbard ST. Making treatment effect inferences from multiple-baseline data: the utility of multilevel modeling approaches. Behav Res Meth. 2009;41(2):372–384.
47. Huitema BE, McKean JW. Reduced bias autocorrelation estimation: three Jackknife methods. Educ Psychol Meas. 1994;54(3):654–665.
48. McKnight SD, McKean JW, Huitema BE. A double bootstrap method to analyze linear models with autoregressive error terms. Psychol Methods. 2000;5(1):87–101.
49. Verbeke G, Molenberghs G. Linear Mixed Models for Longitudinal Data. New York, NY: Springer; 2000.
50. Van den Noortgate W, Onghena P. A multilevel meta-analysis of single-subject experimental design studies. Evid Based Commun Assess Interv. 2008;2(3):142–151.
51. Ferron JM, Farmer JL, Owens CM. Estimating individual treatment effects from multiple-baseline data: a Monte Carlo study of multilevel-modeling approaches. Behav Res Meth. 2010;42(4):930–943.
52. Hedges LV, Pustejovsky JE, Shadish WR. A standardized mean difference effect size for single case designs. Res Synth Methods. 2012;3(3):224–239.
53. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
54. Manolov R, Moeyaert M. How can single-case data be analyzed? Software resources, tutorial, and reflections on analysis. Behav Modif. 2017;41(2):179–228.
55. Manolov R, Moeyaert M. Recommendations for choosing single-case data analytical techniques. Behav Ther. 2017;48(1):97–114.
56. Wendt O, Miller B. Quality appraisal of single-subject experimental designs: an overview and comparison of different appraisal tools. Educ Treat Children. 2012;35(2):235–286.
57. Tate RL, Perdices M, Rosenkoetter U, et al. The Single-Case Reporting guideline In BEhavioural interventions (SCRIBE) 2016 statement. J School Psychol. 2016;56:133–142.
58. Gast DL, Ledford JR. Single Case Research Methodology: Applications in Special Education and Behavioral Sciences. New York, NY: Routledge; 2014.
59. Feeney TJ, Ylvisaker M. Context-sensitive cognitive-behavioral supports for young children with TBI. J Posit Behav Interv. 2008;10(2):115–128.
60. Lin C-Y, Chang Y-M. Increase in physical activities in kindergarten children with cerebral palsy by employing MaKey-MaKey-based task systems. Res Dev Disabil. 2014;35:1963–1969.
61. Lane-Brown A, Tate R. Evaluation of an intervention for apathy after traumatic brain injury: a multiple-baseline, single-case experimental design. J Head Trauma Rehab. 2010;25(6):459–469.
62. Lieberman LJ, Dunn JM, van der Mars H, McCubbin J. Peer tutors' effects on activity levels of deaf students in inclusive elementary physical education. Adapt Phys Act Q. 2000;17(1):20–39.
63. Lundblom EEG, Woods JJ. Working in the classroom: improving idiom comprehension through classwide peer tutoring. Commun Disord Q. 2012;33(4):202–219.
64. Banda DR, Hart SL, Liu-Gitz L. Impact of training peers and children with autism on social skills during center time activities in inclusive classrooms. Res Autism Spect Dis. 2010;4(4):619–625.
65. Oddo M, Barnett DW, Hawkins RO, Musti-Rao S. Reciprocal peer tutoring and repeated reading: Increasing practicality using student groups. Psychol Schools. 2010;47(8):842–858.
66. Peterson-Brown S, Karich AC, Symons FJ. Examining estimates of effect using non-overlap of all pairs in multiple baseline studies of academic intervention. J Behav Educ. 2012;21(3):203–216.
67. Chen M, Hyppa-Martin JK, Reichle JE, Symons FJ. Comparing single case design overlap-based effect size metrics from studies examining speech generating device interventions. Am J Intellect Dev Disabil. 2016;121(3):169–193.
68. Derakhshandeh F, Nikmaram M, Hosseinabad HH, et al. Speech characteristics after articulation therapy in children with cleft palate and velopharyngeal dysfunction: a single case experimental design. Int J Pediatr Otorhinolaryngol. 2016;86:104–113.
69. Klingbeil D, Moeyaert M, Archerm C, Chimnoza TM, Zwolski SA. Examining the efficacy of peer-mediated incremental rehearsal. School Psychol Rev. 2017;46:122–140.
70. Asaro-Saddler K, Saddler B, Moeyaert M, Ellis-Robinson T. Effects of a summarizing strategy on written summaries of children with emotional and behavioral disorders. Rem Spec Educ. 2017; doi: 10.1177/0741932516669051.
71. Ingersoll B, Wainer A. Initial efficacy of project ImPACT: a parent-mediated social communication intervention for young children with ASD. J Autism Dev Disord. 2013;43(12):2943–2952.
72. Wade CA, Ortiz C, Gorman BS. Two-session group parent training for bedtime noncompliance in head start preschoolers. Child Fam Behav Ther. 2007;29(3):23–55.
73. Hartman DP, Barrios BA, Wood DD. Principles of behavioral observation. In: Haynes SN, Hieby EM, eds. Comprehensive Handbook of Psychological Assessment. Behavioral Assessment. Vol 3. New York, NY: Wiley; 2004.
74. Reichow B, Volkmar F, Cicchetti D. Development of the evaluative method for evaluating and determining evidence-based practices in autism. J Autism Dev Disord. 2008;38(7):1311–1319.
75. Simeonsson R, Bailey D. Evaluating programme impact: levels of certainty. In: IDMRB ed. Early Intervention Studies for Young Children With Special Needs. London, England: Chapman & Hall; 1991:280–296.
76. Schlosser RW, Sigafoos J, Belfiore P. EVIDAAC Comparative Single-Subject Experimental Design Scale (CSSEDARS). Published 2009. Accessed November 20, 2016 from http://
77. Logan LR, Hickman RR, Harris SR, Heriza CB. Single-subject research design: recommendations for levels of evidence and quality rating. Dev Med Child Neurol. 2008;50:99–103.
78. Rohatgi A. WebPlotDigitizer User Manual Version 3.4. Published 2014. Accessed from

n-of-1 studies; quality assessment; research design; single-subject research

© 2017 Academy of Neurologic Physical Therapy, APTA