Privacy-Maintaining Propensity Score-Based Pooling of Multiple Databases Applied to a Study of Biologics

Rassen, Jeremy A. ScD*; Solomon, Daniel H. MD, MPH*†; Curtis, Jeffrey R. MD, MPH‡; Herrinton, Lisa PhD§; Schneeweiss, Sebastian MD, ScD*

doi: 10.1097/MLR.0b013e3181d59541
Comparative Effectiveness

Introduction: A large study on the safety of biologics required pooling of data from multiple data sources. Extensive confounder adjustment was necessary, but private, individual-level covariate information could not be shared.

Objectives: To describe the methods of pooling data that investigators considered, and to detail the strengths and limitations of the chosen method: a propensity score (PS)-based approach that allowed for full multivariate adjustment without compromising patient privacy.

Research Design: The project had a central data coordinating center responsible for collection and analysis of data. Private data could not be transmitted to the data coordinating center. Investigators assessed 4 methods for pooled analyses: full covariate sharing, cell-aggregated sharing, meta-analysis, and the PS-based method. We evaluated each method for protection of private information, analytic integrity and flexibility, and ability to meet the study's operational and statistical needs.
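As a rough illustration of the PS-based data flow described above, the following minimal sketch has each site fit its own PS model and send the coordinating center only a site label, exposure, outcome, and PS per patient. Everything in it is an assumption made for illustration and not the study's actual implementation: the simulated covariates, the hand-rolled logistic fit, the function names, and the choice of a Mantel-Haenszel odds ratio over within-site PS quintiles as the pooled analysis.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient ascent; returns [intercept, *coefs]."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def site_extract(site_id, X, exposed, outcome):
    """Runs at the site: covariates X never leave; only four fields per patient do."""
    beta = fit_logistic(X, exposed)
    ps = [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]
    return [{"site": site_id, "exposed": e, "outcome": o, "ps": s}
            for e, o, s in zip(exposed, outcome, ps)]

def add_strata(rows, n_strata=5):
    """Runs at the center: label each row with its (site, within-site PS quantile)."""
    by_site = {}
    for r in rows:
        by_site.setdefault(r["site"], []).append(r)
    for site_rows in by_site.values():
        site_rows.sort(key=lambda r: r["ps"])
        for rank, r in enumerate(site_rows):
            r["stratum"] = (r["site"], rank * n_strata // len(site_rows))
    return rows

def mantel_haenszel_or(rows):
    """Mantel-Haenszel odds ratio pooled over the 'stratum' field."""
    tables = {}
    for r in rows:
        t = tables.setdefault(r["stratum"], [0, 0, 0, 0])  # a, b, c, d
        t[(0 if r["exposed"] else 2) + (0 if r["outcome"] else 1)] += 1
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables.values())
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables.values())
    return num / den if den else float("nan")

def simulate_site(seed, n=300):
    """Hypothetical site data: two confounders drive both exposure and outcome."""
    rng = random.Random(seed)
    X, exposed, outcome = [], [], []
    for _ in range(n):
        age, sick = rng.random(), float(rng.random() < 0.4)
        e = int(rng.random() < sigmoid(-1.0 + 1.5 * age + 1.0 * sick))
        o = int(rng.random() < sigmoid(-2.0 + 1.0 * age + 1.0 * sick + 0.5 * e))
        X.append([age, sick])
        exposed.append(e)
        outcome.append(o)
    return X, exposed, outcome

pooled = []
for site_id, seed in [("A", 1), ("B", 2)]:
    X, e, o = simulate_site(seed)
    pooled += site_extract(site_id, X, e, o)
pooled = add_strata(pooled)
print("Pooled PS-stratified odds ratio:", mantel_haenszel_or(pooled))
```

In this sketch the PS acts as a one-number summary of each patient's covariates, so the center can stratify on it without ever receiving the covariates themselves.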

Results: Analysis of 4 example datasets yielded substantially similar estimates if data were pooled with a PS versus individual covariates (0%–3% difference in point estimates). Several practical challenges arose. (1) PSs are best suited for dichotomous exposures but 6 or more exposure categories were desired; we chose a series of exposure contrasts with a common referent group. (2) Subgroup analyses had to be specified a priori. (3) Time-varying exposures and confounders required appropriate analytic handling including re-estimation of PSs. (4) Detection of heterogeneity among centers was necessary.
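The first challenge above, turning a multi-category exposure into a series of dichotomous contrasts against a common referent, might be sketched as follows. The record layout, the category labels, and the function name `build_contrasts` are hypothetical; the sketch only shows how one cohort per non-referent category could be assembled so that a separate PS can then be estimated within each.

```python
from collections import defaultdict

def build_contrasts(records, referent):
    """Return {category: cohort}, where each cohort holds that category's
    patients (exposed=1) plus all referent patients (exposed=0)."""
    referent_rows = [r for r in records if r["exposure"] == referent]
    contrasts = defaultdict(list)
    for r in records:
        if r["exposure"] != referent:
            contrasts[r["exposure"]].append({**r, "exposed": 1})
    for cohort in contrasts.values():
        cohort.extend({**r, "exposed": 0} for r in referent_rows)
    return dict(contrasts)

# Hypothetical records with generic category labels.
records = [
    {"id": 1, "exposure": "referent"},
    {"id": 2, "exposure": "referent"},
    {"id": 3, "exposure": "agent_1"},
    {"id": 4, "exposure": "agent_2"},
    {"id": 5, "exposure": "agent_2"},
]
cohorts = build_contrasts(records, referent="referent")
for category, cohort in cohorts.items():
    n_exposed = sum(r["exposed"] for r in cohort)
    print(category, "exposed:", n_exposed, "referent:", len(cohort) - n_exposed)
```

Because every contrast reuses the same referent patients, the resulting estimates are not independent of one another, which is worth keeping in mind when interpreting them side by side.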

Conclusions: The PS-based pooling method offered strong protection of patient privacy and a reasonable balance between analytic integrity and flexibility of study execution. We would recommend its use in other studies that require pooling of databases, multivariate adjustment, and privacy protection.

From the Divisions of *Pharmacoepidemiology and Pharmacoeconomics, and †Rheumatology, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA; ‡Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL; and §Division of Research, Kaiser Permanente, Oakland, CA.

Supported by the Agency for Healthcare Research and Quality (AHRQ) contract 1 U18 HS017919. Dr. Rassen is a recipient of a career development award from AHRQ (1 K01 HS018088). Dr. Schneeweiss is PI of the Brigham and Women's Hospital AHRQ-funded DEcIDE Center on comparative effectiveness research.

Reprints: Jeremy A. Rassen, ScD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, 1620 Tremont Street, Suite 3030, Boston, MA 02120. E-mail:

© 2010 Lippincott Williams & Wilkins, Inc.