Understanding propensity scores : JBI Evidence Implementation


Understanding propensity scores

Fernandez, Ritin PhD1,2,3; Tufanaru, Catalin MD, MPH, MClinSci, PhD4

International Journal of Evidence-Based Healthcare 15(4):p 142-143, December 2017. | DOI: 10.1097/XEB.0000000000000127

The randomized controlled trial (RCT) is considered the ideal study design for quantitative research exploring the causal effects of interventions on outcomes of interest.1–3 One explanation for this preference is that in a well conducted RCT, under ideal conditions, the allocation of participants to the compared interventions is not influenced by participants’ known (observed) characteristics, that is, those recorded in the available data. As a result, the compared groups are comparable at baseline, prior to any intervention, with regard to all observed and, presumably, all relevant unobserved characteristics (those not captured in the available data) that might otherwise provide alternative explanations for any observed differences in effects.1–3 However, in many instances RCTs are not possible for ethical or feasibility reasons, and the causal effects of interventions are instead explored in nonrandomized experimental studies (quasiexperimental studies) or in nonexperimental studies (observational studies).2,3 Causal effects of interventions may be properly estimated in quasiexperimental and observational studies, provided that appropriate statistical analysis is used; one such approach is propensity score analysis (PSA).1–3

In simple terms, PSA refers to a set of statistical methods that compute quantitative scores, known as propensity scores, which are used to correct for the observed differences between participants receiving the compared interventions (that is, differences visible in the available data on participants).1–5 A propensity score for a study participant is a summary score representing the probability that the participant receives the treatment of interest, given the participant's observed characteristics.2,4 Propensity scores are estimated using statistical software packages such as Stata (StataCorp, College Station, Texas, USA), SAS (SAS Institute Inc., Cary, North Carolina, USA), and R (R Foundation for Statistical Computing, Vienna, Austria), employing diverse statistical approaches including, for example, logistic regression, classification and regression trees, and discriminant analysis.2,4 The propensity score is ideally computed from all relevant participant characteristics (such as age, coexisting diseases, and previous treatment), condensing the information they jointly provide into a single score.2,4 Participants with different individual characteristics may nevertheless have similar propensity scores; in other words, they may have a similar overall probability of receiving the treatment of interest once all relevant observed characteristics are considered simultaneously.2,4 Essentially, participants from different groups who have similar propensity scores are comparable in that they have a similar probability of receiving the treatment of interest.2,4 The propensity score is therefore used to assess and facilitate the comparability (similarity) of study participants from the compared groups.2,4

The computed propensity scores are used in statistical analysis in four main ways: matching, stratification, adjustment, and weighting.2,4,5 In the matching approach, participants from the compared groups are matched on similar propensity scores, and the statistical analyses are then performed on data from the matched participants. In the stratification approach, participants from the compared groups are classified into strata that are homogeneous with regard to the propensity score, and the statistical analyses are then performed within these strata.2,4,5 In the adjustment approach, the propensity score may be included as a covariate in analyses of covariance, and in the weighting approach, propensity scores may be used to derive weights in multivariate statistical analyses.2,4,5
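The workflow described above can be illustrated with a minimal sketch in Python, using only the standard library. Formally, the propensity score of a participant with observed characteristics x is e(x) = P(treatment = 1 | X = x). Everything specific in the sketch is an assumption made for illustration and does not come from the studies cited in this article: the simulated covariates (a standardized age and a comorbidity indicator), the function names, the caliper of 0.05, the choice of five strata, and the use of inverse probability of treatment weights as one common weighting scheme. Logistic regression is fitted here by plain gradient descent only to keep the example self-contained; in practice a statistical package would be used.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, t, lr=0.5, epochs=500):
    # Logistic regression of treatment t on covariates X (batch gradient descent).
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - ti
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity_scores(X, w):
    # Estimated probability of receiving treatment for each participant.
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def match_nearest(scores, t, caliper=0.05):
    # Greedy 1:1 nearest-neighbour matching without replacement, within a caliper.
    controls = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in [i for i, ti in enumerate(t) if ti == 1]:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs

def stratify(scores, n_strata=5):
    # Assign participants to strata of roughly equal size by rank of score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(scores)
    return strata

def ipw_weights(scores, t):
    # Inverse probability of treatment weights: 1/e if treated, 1/(1-e) if not.
    return [1.0 / e if ti == 1 else 1.0 / (1.0 - e) for e, ti in zip(scores, t)]

# Simulated (hypothetical) observational data: treatment depends on covariates.
random.seed(0)
X, t = [], []
for _ in range(400):
    age = random.gauss(0.0, 1.0)                    # standardized age
    comorbid = 1.0 if random.random() < 0.4 else 0.0
    p_treat = sigmoid(-0.5 + 0.8 * age + 1.0 * comorbid)
    X.append([age, comorbid])
    t.append(1 if random.random() < p_treat else 0)

w = fit_logistic(X, t)
scores = propensity_scores(X, w)
pairs = match_nearest(scores, t)
strata = stratify(scores)
weights = ipw_weights(scores, t)

print("treated:", sum(t), "controls:", len(t) - sum(t))
print("matched pairs:", len(pairs))
```

Outcome analyses would then be run on the matched pairs, within each stratum, or with the weights applied, after checking that the covariates are balanced between the compared groups.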

There are many published PSA studies. For example, Normand et al.6 used observational data to explore whether adherence to recommendations for coronary angiography more than 12 h after symptom onset, but prior to hospital discharge, after acute myocardial infarction resulted in better survival; propensity scores were used to create the matched retrospective sample of patients used in the statistical analysis. Gum et al.7 used observational data from a cohort study to examine whether the use of aspirin is associated with a mortality benefit in stable patients with known or suspected coronary disease; propensity scores were used to match study participants in order to adjust for selection bias and confounding. Ahmed et al.8 used a retrospective analysis of observational data to examine the effects of diuretics on heart failure outcomes; propensity scores were used both to match study participants and for adjustment in the regression analysis.

In summary, although the RCT is considered the ideal study design for quantitative research exploring the causal effects of interventions on outcomes of interest, whenever RCTs are not possible for ethical or feasibility reasons, the causal effects of interventions may be properly estimated in quasiexperimental and observational studies by applying appropriate statistical analysis such as PSA.


Declarations: This article has not been published elsewhere, nor is it currently under submission elsewhere.

Conflicts of interest

The authors report no conflicts of interest.


1. Rosenbaum PR. Observational studies. 2nd ed. New York: Springer-Verlag; 2002.
2. Guo S, Fraser MW. Propensity score analysis: statistical methods and applications. 2nd ed. Advanced quantitative techniques in the social sciences series, volume 11. Los Angeles: SAGE; 2015.
3. Shadish WR, Cook TD, Campbell DT. Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton-Mifflin; 2002.
4. Pan W, Bai H. Propensity score analysis: fundamentals and developments. New York: Guilford; 2015.
5. Stuart EA, Rubin DB. Best practices in quasi-experimental designs: matching methods for causal inference. In: Osborne JW, editor. Best practices in quantitative methods. Los Angeles: SAGE; 2008.
6. Normand ST, Landrum MB, Guadagnoli E, Ayanian JZ, Ryan TJ, Cleary PD, McNeil BJ. Validating recommendations for coronary angiography following acute myocardial infarction in the elderly: a matched analysis using propensity scores. J Clin Epidemiol 2001; 54:387–398.
7. Gum PA, Thamilarasan M, Watanabe J, Blackstone EH, Lauer MS. Aspirin use and all-cause mortality among patients being evaluated for known or suspected coronary artery disease: a propensity analysis. JAMA 2001; 286:1187–1194.
8. Ahmed A, Husain A, Love TE, et al. Heart failure, chronic diuretic use, and increase in mortality and hospitalization: an observational study using propensity score methods. Eur Heart J 2006; 27:1431–1439.
International Journal of Evidence-Based Healthcare © 2017 The Joanna Briggs Institute
