Epidemiology: July 2009 - Volume 20 - Issue 4
doi: 10.1097/EDE.0b013e3181a663cc
Methods: Original Article

High-dimensional Propensity Score Adjustment in Studies of Treatment Effects Using Health Care Claims Data

Schneeweiss, Sebastian; Rassen, Jeremy A.; Glynn, Robert J.; Avorn, Jerry; Mogun, Helen; Brookhart, M Alan

Author Information

From the Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.

Submitted 27 May 2008; accepted 23 September 2008.

Supported by the National Institute of Mental Health grant (RO1-MH078708); and the National Institute on Aging grant (RO1-AG021950, RO1-AG023178, RO1-AG018833, and K25-AG027400).

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).

Editors' note: A commentary on this article appears on page 521.

Correspondence: Sebastian Schneeweiss, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 1620 Tremont St. (suite 3030), Boston, MA 021205. E-mail: schneeweiss@post.harvard.edu.

Abstract

Background: Adjusting for large numbers of covariates ascertained from patients’ health care claims data may improve control of confounding, as these variables may collectively be proxies for unobserved factors. Here, we develop and test an algorithm that empirically identifies candidate covariates, prioritizes covariates, and integrates them into a propensity-score-based confounder adjustment model.

Methods: We developed a multistep algorithm to implement high-dimensional proxy adjustment in claims data. Steps include (1) identifying data dimensions, eg, diagnoses, procedures, and medications; (2) empirically identifying candidate covariates; (3) assessing recurrence of codes; (4) prioritizing covariates; (5) selecting covariates for adjustment; (6) estimating the exposure propensity score; and (7) estimating an outcome model. This algorithm was tested in Medicare claims data, including a study on the effect of Cox-2 inhibitors on reduced gastric toxicity compared with nonselective nonsteroidal anti-inflammatory drugs (NSAIDs).

Results: In a population of 49,653 new users of Cox-2 inhibitors or nonselective NSAIDs, a crude relative risk (RR) for upper GI toxicity (RR = 1.09 [95% confidence interval = 0.91–1.30]) was initially observed. Adjusting for 15 predefined covariates resulted in a possible gastroprotective effect (0.94 [0.78–1.12]). A gastroprotective effect became stronger when adjusting for an additional 500 algorithm-derived covariates (0.88 [0.73–1.06]). Results of a study on the effect of statin on reduced mortality were similar. Using the algorithm adjustment confirmed a null finding between influenza vaccination and hip fracture (1.02 [0.85–1.21]).

Conclusions: In typical pharmacoepidemiologic studies, the proposed high-dimensional propensity score resulted in improved effect estimates compared with adjustment limited to predefined covariates, when benchmarked against results expected from randomized trials.

Large health care utilization databases are frequently used to estimate the causal effect of prescription drugs on health outcomes.1 Health care utilization data reflect routine practice, are large enough to study rare drug effects, and avoid the delays common in the collection of primary data.2,3 Despite their importance, studies of pharmacoepidemiologic claims data have been criticized for the incompleteness of information on potential confounders such as markers of clinical disease severity, laboratory results, functional status, body mass index, smoking status, and over-the-counter medication use. Such factors may lead to selective prescribing, which may in turn result in biased estimates of the association between drugs and health outcomes.4 Longitudinal claims data contain information about patient health status and confounding beyond what is normally used in pharmacoepidemiologic research. We have explored the utility of this additional information with the help of a computer algorithm examining all health care claims data.

PROXY ADJUSTMENT

Longitudinal health care claims data can be understood and analyzed as a set of proxies that indirectly describe the health status of patients. This status is presented through the lenses of health care providers recording their findings and interventions via coders and operating under the constraints of a specific health care system.5 Quite often, several levels of proxies, which we call chains of proxies, are involved. For example, the health state of a patient can be assessed through (a) the dispensing of a drug that was (b) prescribed by a physician who made a diagnosis in a (c) patient who came forward for medical care, and (d) presented certain symptoms (Fig. 1). Such a chain of proxies is influenced by access to care,6 severity of the condition, diagnostic ability of the physician, preference for one drug over another,7 the patient's ability to pay the medication copayment,8 and the accurate recording of the dispensed medication. Here, the chain of proxies leads to a reasonable interpretation that the patient indeed had a condition that troubled the patient enough to see the physician, and that was severe enough for the physician to treat and for the patient to pay a copayment for the medication. Medical evidence and treatment options were weighed in several steps along the way. These are not observable in claims data, but collectively they resulted in a measurable action.

Figure 1

The measured action in this case had a clear interpretation (the prescribed medication addressed a specific condition), but such interpretations are not always possible. In fact, we cannot determine the exact interpretation in most cases but, at the same time, an exact interpretation is not required for effective confounder adjustment. For example, old age serves as a proxy for comorbidity, frailty, cognitive decline, and many other factors. Adjusting for a perfect surrogate of an unmeasured factor is equivalent to adjusting for the factor itself.9 The degree to which a surrogate is related to an unobserved or imperfectly observed confounder is proportional to the degree to which adjustment can be achieved.10,11

If we could measure a battery of proxies, we would increase the likelihood that in combination they are a good overall proxy for relevant unobserved confounding factors. Using a large number of proxy covariates for propensity score estimation and then estimating the average causal treatment effect conditional on deciles of the propensity score may result in improved control for confounding in epidemiologic studies of treatment effects using claims data compared with models that have fewer covariates. Some authors have explored the use of very large propensity score models in nonrandomized assessment of treatment effects.12,13 In some studies, large propensity score models resulted in better control of confounding than estimating the propensity score with less covariate information.14,16 A major challenge remains, however, to identify a very large pool of potential covariates that can be implemented in claims data, and then to identify which are influential enough in the treatment/disease relationship to include in an analysis.

We propose an algorithm that identifies a large number of covariates in claims databases, eliminates covariates with very low prevalence and minimal potential for causing bias, and then uses propensity score techniques to adjust for a large number of target covariates. The approach is illustrated using 3 pharmacoepidemiologic studies in elderly patients: (a) 2 studies of intended treatment effects (statin use and reduced risk of death, and use of selective Cox-2 inhibitors and reduced risk of GI complications), and (b) a study of influenza vaccination and hip fractures with an expected null association.

METHODS

We describe a generic algorithm that identifies a large number of target covariates in claims databases and selects covariates for propensity score adjustment to minimize residual confounding. We present 7 steps to achieve high-dimensional propensity score adjustment using health care claims databases. These steps assume that the cohort, exposure, and outcome have already been defined.

1. Specify Data Sources.

A wide range of databases of health care utilization data (claims) is available for use in pharmacoepidemiology.3 Each database is arranged in specific ways using a variety of classifications to code diagnoses (eg, International Classification of Diseases [ICD]-8 through ICD-10), procedures (eg, Current Procedural Terminology, Canadian Classification of Procedures, ICD-9-Clinical Modification), or medications (eg, National Drug Codes, American Hospital Formulary Services, Anatomic Therapeutic Chemical Classification). Beyond these basic data dimensions and coding systems, many more data dimensions can be found in such databases. Some databases provide additional dimensions such as laboratory results, other electronic medical record information, and accident registries.

We propose an algorithm that is independent of the specific data source as long as the source's data dimensions can be identified. In Figure 2, we provide a flow diagram using a typical example of data dimensions available in US Medicare claims data linked to medication use data. First, a temporal window must be defined in which baseline covariates will be identified. A frequent choice is 6 or 12 months preceding the initiation of the study or comparison drug.2 The recording of diagnoses and procedures is correlated with the frequency of health care encounters. Longer baseline periods therefore increase the number of encounters and yield more covariate information.2

Figure 2

The most basic patient information always available in typical databases is age, sex, and calendar time. We assume that, given their ubiquity, these demographic covariates will always be adjusted for.

Additional covariates can then be identified from the various data dimensions, but it is first necessary to identify variables that should not be part of covariate adjustment. Although it is generally recommended to include many covariates in a propensity score regression model, in specific cases researchers may exclude variables from covariate adjustment.17 Surrogates for the exposure that are strong correlates of the study exposure but not associated with the outcome will not only increase standard errors but may also increase bias—and should, therefore, not be included in propensity score analyses.18,19 Bias can also occur through the inclusion of so-called “collider” variables, although this bias is generally thought to be weak.20,21 In our example study comparing statin initiation with glaucoma drug initiation, diagnostic codes for glaucoma should not be included in a propensity score because of their close correlation with treatment choice but not with the outcome other than through treatment.22 At this stage of the procedure, such codes can be identified and removed from the dimension data input to the algorithm. We have developed a screening tool for such covariates as part of the algorithm that will help investigators identify and remove such covariates (eAppendix 1, http://links.lww.com/A1043).

2. Identify Candidate Empirical Covariates.

Within each of p data dimensions (eg, outpatient diagnostic ICD codes, inpatient procedure codes, and drugs dispensed) codes were sorted by their prevalence. Prevalence was measured as the proportion of patients having a specific code at least once during a 6-month baseline period. Since the prevalence of a binary factor is symmetrical around 0.5, we subtracted all prevalence estimates larger than 0.5 from 1.0. The top n most prevalent codes were identified as candidate empirical covariates. If fewer than 100 patients were identified with a covariate, the covariate was dropped.

The prevalence of each code (and therefore its empirical ranking) depends on the granularity of the coding; ICD-9 codes are hierarchical such that each additional digit provides more detail of the diagnosis. Considering the fourth or fifth digit of the ICD-9 code will reduce the prevalence of the code in the data but may be a better proxy for the underlying confounder. We initially set the granularity to 3 digits for ICD-9 data for this illustration, since every system using ICD-9 records at least 3 digits. Granularity decisions need to be considered for all data dimensions, including medication coding. Depending on the application, therapeutic class may be sufficiently detailed and in other settings individual drugs or even drug dose or preparations may be required. We chose the individual drug level for the base-case algorithm.
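The candidate identification step can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS macro referred to in this article: the function name `candidate_covariates`, its parameters, and the representation of claims as (patient, code) pairs are assumptions made for the example, and granularity is approximated by simple truncation of the code string.

```python
from collections import defaultdict


def candidate_covariates(claims, n_patients, n_top=200, min_patients=100, digits=3):
    """Rank the codes of one data dimension by folded prevalence.

    claims: iterable of (patient_id, code) pairs from the baseline period.
    n_patients: cohort size, used as the prevalence denominator.
    All names and defaults here are hypothetical choices for illustration.
    """
    patients_with_code = defaultdict(set)
    for patient_id, code in claims:
        # Granularity: keep only the leading characters of the code.
        patients_with_code[code[:digits]].add(patient_id)

    ranked = []
    for code, patients in patients_with_code.items():
        if len(patients) < min_patients:
            continue  # drop codes carried by too few patients
        prevalence = len(patients) / n_patients
        # Fold around 0.5 so that a prevalence of 0.9 ranks like 0.1.
        folded = min(prevalence, 1.0 - prevalence)
        ranked.append((folded, code))

    # Highest folded prevalence first; ties broken alphabetically.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [code for _, code in ranked[:n_top]]


# Toy example: 4 patients, threshold lowered so small data are retained.
toy_claims = [(1, "25000"), (1, "25001"), (2, "25000"),
              (3, "4019"), (4, "4019"), (4, "4280")]
print(candidate_covariates(toy_claims, n_patients=4, n_top=2, min_patients=1))
```

In the toy data, the 5-digit codes 25000 and 25001 collapse into the 3-digit code 250, which shows how coarser granularity raises prevalence at the cost of detail.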

3. Assess Recurrence.

For the top n most prevalent codes in each data dimension, we assessed how frequently that code was recorded for each patient during the baseline period. We divided each code into 3 binary variables: code occurred ≥1 time, ≥median number of times, and ≥75th percentile number of times. A code that appeared above the 75th percentile number of times would have a “true” value for all 3 recurrence variables. If any of the values were equal, the variable representing the higher cutpoint was dropped. For a data structure with p data dimensions, this results in up to p × n × 3 covariates.
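A minimal sketch of the recurrence assessment follows. It assumes that the median and 75th percentile are taken among patients who have the code at least once and that percentiles are computed by the nearest-rank method; neither detail is stated in the text, and the function and variable names are hypothetical.

```python
import math


def nearest_rank(sorted_values, q):
    """Nearest-rank percentile of an ascending list (q between 0 and 1)."""
    rank = max(1, math.ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def recurrence_variables(code, counts):
    """Build up to 3 binary recurrence covariates for one code.

    counts: dict mapping patient_id to the number of times the code was
    recorded during baseline (0 if never).
    """
    positive = sorted(c for c in counts.values() if c > 0)
    if not positive:
        return {}
    cutpoints = [
        ("once", 1),
        ("median", nearest_rank(positive, 0.50)),
        ("p75", nearest_rank(positive, 0.75)),
    ]
    variables = {}
    seen = set()
    for label, cut in cutpoints:
        if cut in seen:
            continue  # equal cutpoints: drop the variable for the higher one
        seen.add(cut)
        variables[f"{code}_{label}"] = {pid: c >= cut for pid, c in counts.items()}
    return variables


example = recurrence_variables("401", {1: 0, 2: 1, 3: 2, 4: 4, 5: 8})
for name, values in example.items():
    print(name, values)
```

In this toy example, patient 5 (8 recordings) is "true" on all 3 variables, whereas patient 2 (1 recording) is "true" only on the first.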

4. Prioritize Covariates.

If we now wish to combine information from all p data dimensions to reduce the total number of covariates, we need to consider that the average prevalence of codes can be quite different among dimensions. From our experience, procedure codes, including codes for simple office visits, have a higher prevalence than drug dispensings. Simply combining these tables and picking the top k prevalent candidate covariates would down-weight the importance of medication-dispensing in controlling for confounding. Further, Brookhart et al20 showed that including patient characteristics in the propensity score that are associated with the exposure but not the outcome will increase variance of the estimator with no improvement in confounding control, and in some situations can actually introduce confounding. We, therefore, decided to prioritize covariates across data dimensions by their potential for controlling confounding that is not conditional on exposure and other covariates. Because we are exclusively dealing with binary covariates, the confounded or apparent relative risk (ARR) is a function of the imbalance in prevalence of a binary confounding factor among exposed (PC1) and unexposed (PC0) subjects as well as the independent association between a confounder and the study outcome (RRCD)23:

$$\mathrm{ARR} = \mathrm{RR} \times \frac{P_{C1}(\mathrm{RR}_{CD} - 1) + 1}{P_{C0}(\mathrm{RR}_{CD} - 1) + 1}$$

The fraction on the right side of the equation is the multiplicative bias term, BiasM. We then sorted all p × n × 3 covariates by the magnitude of log (BiasM) in descending order. We chose multiplicative bias assessment because a bias term on the absolute risk scale (BiasA = RDCE × RDCD) would implicitly down-weight the association between a confounder and outcome if the outcome event rate is small but the prevalence of the exposure high, a typical occurrence in pharmacoepidemiologic cohort studies. The covariate prioritization is illustrated for binary variables since our algorithm to generate target covariates exclusively creates binary variables.
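The prioritization can be sketched as follows. The sketch assumes that the multiplicative bias term takes the standard form of the Bross formula, BiasM = [PC1(RRCD − 1) + 1]/[PC0(RRCD − 1) + 1], and that "magnitude" means the absolute value of log(BiasM). The function names and the toy prevalences and relative risks are hypothetical.

```python
import math


def multiplicative_bias(pc1, pc0, rr_cd):
    """Bias factor for a binary covariate (assumed Bross-type expression).

    pc1, pc0: covariate prevalence among exposed and unexposed subjects.
    rr_cd: relative risk relating the covariate to the outcome.
    """
    return (pc1 * (rr_cd - 1) + 1) / (pc0 * (rr_cd - 1) + 1)


def prioritize(covariates):
    """Order covariates by |log(BiasM)|, largest potential bias first.

    covariates: list of (name, pc1, pc0, rr_cd) tuples.
    """
    scored = [
        (abs(math.log(multiplicative_bias(pc1, pc0, rr_cd))), name)
        for name, pc1, pc0, rr_cd in covariates
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


toy = [
    ("balanced", 0.5, 0.5, 3.0),        # no imbalance, so no bias
    ("more_in_exposed", 0.4, 0.2, 2.0),
    ("more_in_unexposed", 0.1, 0.3, 2.0),
]
print(prioritize(toy))
```

A covariate that is perfectly balanced between exposure groups has BiasM = 1 and ranks last regardless of how strongly it predicts the outcome, which is the behavior the prioritization is meant to achieve.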

5. Select Covariates.

Once this prioritization of covariates was accomplished, we included the top k covariates from step 4, which could be as large as p × n × 3 when including all candidate covariates. Our base case settings were p = 8 and n = 200 resulting in 4800 candidate covariates. We selected the top k = 500 binary empirical covariates (about 10%) for inclusion in the propensity score modeling.

In addition to these k binary empirical covariates, we included covariates that should always be adjusted for if available, including d binary demographic covariates age, sex, race (available in Medicare data), and calendar year. In addition to these, we allowed the investigator to force l binary, categorical, or numeric predefined covariates into the propensity score model, based on context knowledge regarding the specific study question.

In a subanalysis, we explored the impact of adjusting for 2-way interaction terms. Of the k empirical covariates, we selected the 20 highest priority empirical covariates and computed multiplicative 2-way interactions among those covariates and with the demographic and predefined covariates, resulting in another (20 + d + l) × (20 + d + l)/2 covariates.
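The covariate counts in this step follow from simple arithmetic, sketched below. The base-case values p = 8, n = 200, and k = 500 come from the text; the values d = 5 and l = 15 are hypothetical and serve only to illustrate the interaction-term expression as written above.

```python
def candidate_count(p, n, levels=3):
    """Upper bound on candidates: p dimensions x n codes x 3 recurrence levels."""
    return p * n * levels


def interaction_count(d, l, top=20):
    """Interaction terms using the expression (top + d + l) x (top + d + l) / 2."""
    m = top + d + l
    return m * m / 2


total = candidate_count(p=8, n=200)
print(total)                         # 4800 candidate covariates in the base case
print(round(500 / total, 3))         # share retained when k = 500
print(interaction_count(d=5, l=15))  # d and l are illustrative values only
```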

6. Estimate Exposure Propensity Score.

Using multivariate logistic regression, a propensity score was estimated for each subject as the predicted probability of exposure conditional on all d + l + k covariates.
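For readers unfamiliar with the mechanics, the following sketch fits a logistic regression by plain gradient ascent and returns predicted exposure probabilities. It is a teaching example under strong simplifying assumptions (one binary covariate, fixed step size, no regularization) and is not how the SAS macro mentioned in this article is implemented; a real analysis with hundreds of covariates would use established statistical software. All names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.5, epochs=5000):
    """Fit intercept and coefficients by gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(X[0]) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for row, exposed in zip(X, y):
            z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
            error = exposed - sigmoid(z)
            gradient[0] += error
            for j, x in enumerate(row):
                gradient[j + 1] += error * x
        weights = [w + learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights


def propensity_scores(X, weights):
    """Predicted probability of exposure for each subject."""
    return [
        sigmoid(weights[0] + sum(w * x for w, x in zip(weights[1:], row)))
        for row in X
    ]


# Toy cohort: exposure is more common when the covariate is present.
X = [[1], [1], [1], [1], [0], [0], [0], [0]]
y = [1, 1, 1, 0, 1, 0, 0, 0]
scores = propensity_scores(X, fit_logistic(X, y))
print([round(s, 2) for s in scores])
```

With a single binary covariate, the fitted scores approach the observed exposure proportions within each covariate level (3 of 4 and 1 of 4 in the toy data).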

7. Estimate Propensity Score-Adjusted Outcome Models.

We grouped subjects into propensity score deciles and used multivariate logistic regression analyses to model the study outcome as a function of exposure and indicator terms for decile of propensity score. In addition to an adjusted estimate, we computed a standardized mortality ratio (SMR)-weighted estimate using weights of 1 for subjects in the study drug group and the odds of the propensity score (PS) for members of the comparison group (PS/[1 − PS]). SMR-weighted estimates provide treatment effect estimates among the treated. As the output of Step 6 includes each subject's propensity score, other ways to use propensity scores in the outcome estimation may be applied, including matching, inverse probability of treatment weighting, or modeling the propensity score as a continuous variable.24 The high-dimensional propensity score algorithm is implemented as a SAS macro available at http://www.drugepi.org.
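The two uses of the propensity score described in this step can be sketched as follows. Decile membership is assigned here by rank, which is one of several reasonable ways to form deciles and is an assumption of this sketch; the function names are hypothetical.

```python
def decile_index(scores):
    """Assign each subject to a propensity score decile (0 = lowest, 9 = highest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    deciles = [0] * len(scores)
    for rank, subject in enumerate(order):
        deciles[subject] = rank * 10 // len(scores)
    return deciles


def smr_weights(exposed, scores):
    """SMR weights: 1 for the study drug group, PS / (1 - PS) for the comparison group."""
    return [1.0 if e else ps / (1.0 - ps) for e, ps in zip(exposed, scores)]


print(smr_weights([1, 0, 0], [0.8, 0.5, 0.2]))
print(decile_index([0.05 * i for i in range(1, 21)]))
```

A comparison subject with a high propensity score resembles the treated group and receives a large weight, whereas one with a low score is down-weighted; this is why the weighted estimate applies to the treated population.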

Example Data Sources and Study Cohorts

All 3 study cohorts were drawn from a population of patients aged 65 years and older enrolled in both Medicare and the Pennsylvania Pharmaceutical Assistance Contract for the Elderly (PACE) programs between 1995 and 2002. PACE is a state pharmaceutical benefits program for residents with incomes below $14,000 for individuals and below $17,200 for couples; its data have been frequently used for pharmacoepidemiologic studies.7,25 All prescription drugs commercially available in the United States during the study period were fully covered by PACE, requiring a nominal copayment of $6. Prescription drug information was assessed based on pharmacy claims from PACE with detailed and highly accurate information26,27 on drug name, dosage, quantity, and date of dispensing.

Study Exposures and Outcomes
Example Cohort 1.

Initiation of nonselective NSAID use versus selective Cox-2 inhibitor use was defined if an eligible beneficiary filled at least one prescription for an NSAID between 1 January 1999 and 31 December 2002 but did not use any NSAID during the 18 months prior to the index date. The index date was the first date an NSAID prescription was filled.28 The follow-up period included the 180 days after the initiation of therapy.

The study outcome of severe gastrointestinal (GI) complication was defined as either a hospitalization for GI hemorrhage or peptic ulcer disease complications including perforation (coded as ICD-9 discharge diagnoses 531x, 532x, 533x, 534x, 535x, or 578x in the first or second position or a physician service code for GI hemorrhage). These definitions were validated in 1762 patients in a hospital discharge database, with a composite positive predictive value (PPV) of 90%.29 We expected to find a moderate protective effect of Cox-2 inhibitors on GI complications,30–32 which may be concealed by confounding.33

Example Cohort 2.

The initial exposure status of statin use, nonuse, or comparator drug use as determined from pharmacy claims was carried forward until censoring after 1 year or death, whichever came first. We analyzed the extent to which patients classified as nonusers started statins during follow-up and how many statin users discontinued use, using a gap of 90 or more days in addition to the dispensed supply without statin use as the definition for statin discontinuation.

We then used Medicare claims data to ascertain time to death. Death information from Medicare records is routinely cross-checked with Social Security data. Subjects were censored at the end of 365 days after drug initiation or disenrollment from the pharmacy assistance program. We expected to find a moderate protective effect of statins on mortality in older adults (RR about 0.85)34 that may be exaggerated by confounding.35

Example Cohort 3.

For the previous examples, we hypothesized protective effects of the drug therapy with confounding going either toward the null (example 1) or away from the null (example 2). We added a third example with a strong prior hypothesis of a null association, based on context knowledge. This is the relationship between influenza vaccination in elderly people and the risk of hip fracture. A typical pre-flu season (1 October–31 December 1996) was selected to assess the exposure to influenza vaccine, and the next 4 months (January–April 1997) were the follow-up period. Patients with prior hip fractures or bisphosphonate use for the treatment of osteoporosis were excluded. Patients were censored after the occurrence of the study end point, death, or disenrollment.

Overall Analytic Strategy

To make the comparisons among models that contained a varying number of covariates as fair as possible, we used the available covariates for propensity score estimation and then adjusted the respective outcome models (logistic regression for examples 1 and 2 and Cox proportional hazard regression for example 3) for deciles of the estimated propensity score. We report the numbers of covariates entered into each propensity score model as well as its c-statistic of model discrimination.

RESULTS

Population characteristics of the 2 sample cohort studies are presented in Tables 1 and 2. There were 32,042 subjects who initiated selective Cox-2 inhibitors. The 17,611 nonselective NSAID initiators were older and had more comorbidities and more risk factors for GI complications. The NSAID initiators had 185 GI complications in 180 days (1.1%), and the Cox-2 initiators had 367 events (1.2%). Compared with 14,889 glaucoma drug initiators, the 21,233 initiators of statin therapy were younger, were more likely to have cardiovascular risk factors, had more health care utilization, and had about equal numbers of comorbidities. The statin initiators had 784 deaths in 1 year (3.7%); the glaucoma drug initiators had 955 deaths (6.4%).
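The event counts above permit a quick arithmetic check of the crude associations. The sketch below computes crude risk ratios and odds ratios from those counts; the function names are hypothetical, and the text does not state which measure the unadjusted estimates represent, so the comparison with reported values is offered only as a plausibility check.

```python
def risk_ratio(events_a, n_a, events_b, n_b):
    """Ratio of cumulative incidences, group a versus group b."""
    return (events_a / n_a) / (events_b / n_b)


def odds_ratio(events_a, n_a, events_b, n_b):
    """Ratio of outcome odds, group a versus group b."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b


# Cox-2 inhibitor initiators versus nonselective NSAID initiators (GI complications)
print(round(odds_ratio(367, 32042, 185, 17611), 2))
# Statin initiators versus glaucoma drug initiators (1-year mortality)
print(round(odds_ratio(784, 21233, 955, 14889), 2))
print(round(risk_ratio(784, 21233, 955, 14889), 2))
```

The crude odds ratios round to 1.09 and 0.56, matching the unadjusted estimates reported in this article, which is consistent with the use of logistic regression for these outcome models. The crude risk ratio for the statin example rounds to 0.58, a reminder that odds ratios and risk ratios diverge somewhat as outcomes become more common.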

Table 1
Table 2

In a traditional multivariable analysis comparing Cox-2 inhibitors and NSAIDs (Table 3), we observed no association with GI complications (RR = 0.94; 95% confidence interval [CI]: 0.78–1.12), which was slightly reduced from an unadjusted RR of 1.09, suggesting that additional adjustment for residual confounding would move the relative-risk further towards a protective effect.

Table 3

Considering statin initiators versus nonstatin-using initiators of glaucoma drugs (Table 4), we observed a strongly reduced risk of 1-year mortality (RR = 0.80; 0.70–0.90), which is closer to the expected results from RCTs in elderly people than an unadjusted analysis (RR = 0.56), suggesting that additional adjustment for residual confounding would move the relative risk further toward the null. In the first 2 example studies, we observed several trends regarding the performance of the high-dimensional propensity score algorithms:

Table 4

1. Adding the high-dimensional propensity scores to the predefined covariates moved the point estimate in the expected direction (Cox-2 inhibitors toward a more protective effect, statins toward a less protective effect), consistent with RCT findings.

2. Results of the high-dimensional propensity score alone were identical to the second decimal place to those of the high-dimensional propensity score combined with predefined covariates (Models 5 and 5a in Tables 3 and 4). SMR-weighted propensity score outcome models resulted in effect estimates of 0.77 (0.67–0.88) for Cox-2 inhibitors and GI complication, and 0.64 (0.59–0.70) for statins and death.

3. First stage c-statistics quantifying the degree of exposure prediction did not consistently correlate with the changes in effect estimates.

4. Including 500 rather than 200 covariates in the high-dimensional PS appeared to move the effect estimate little.

5. More finely granulated diagnostic codes (4 digit ICD-9 vs. 3 digit ICD-9) appeared to move the effect estimate little.

6. Dropping the recurrence assessment appeared to move effect estimates slightly away from the expected direction.

7. Including 2-way interactions appeared to leave the effect estimate unchanged or move it slightly away from the expected direction.

8. Selecting covariates based only on their prevalence without any further covariate prioritization moved effect estimates slightly away from the expected direction in the Cox-2 inhibitor example but not in the statin example.

Bootstrapped 95% confidence intervals based on 1000 samples were very similar to the base case algorithm (Model 5) in both example studies (0.73–1.06 in Table 3 and 0.76–0.98 in Table 4).

For the third example study, on the relationship between influenza vaccination in seniors and hip fractures, we expected a null association with strong contextual support of that null hypothesis. In a cohort of 147,583 patients, 42% received influenza vaccination and we observed 710 hip fractures (0.5%). Due to confounding we observed a slightly protective effect in an unadjusted analysis (RR = 0.93; 0.80–1.08). After adjustment for demographic factors and the high-dimensional propensity score, the effect was entirely explained (RR = 1.02; 0.85–1.21).

DISCUSSION

We hypothesized that high-dimensional proxy adjustment based on propensity score techniques could reduce residual confounding in claims database studies of treatment effects. To explore this hypothesis, we developed a generic algorithm that identifies a large number of target covariates and selects covariates for propensity score adjustment to facilitate high-dimensional propensity score adjustment. In the example studies of drug-outcome relationships, we found that the application of the high-dimensional propensity score algorithm produced results closer to the expected findings based on randomized trials, compared with propensity score adjustment that uses a more limited number of investigator-predefined covariates. We further found that some components of the algorithm were more important than others in our example studies. The covariate prioritization as well as the assessment and adjustment of recurrent coding of health services seem to be important contributors to the algorithm in our examples.

The algorithm's main strength rests on the exploitation of information that is usually untapped in epidemiologic analyses of health care utilization databases. It also includes a variable selection component to limit the number of adjusted covariates to an arbitrary number (500 in our base case) because in theory a number of covariates larger than the number of study subjects could be generated. Hirano and Imbens16 have developed a propensity score variable selection algorithm that is based on comparing the t-statistics of the entire propensity score regression model (tprop) with those of individual covariates (tk). Such test-based approaches to variable selection have been criticized for their dependency on study size and potential for bias.36 This approach also seems impractical if the number of candidate covariates is very large (eg, the 4800 in our example), which can provide a challenge to fitting the entire PS regression model even in large datasets.

Brookhart et al20 found that the inclusion of variables that predict only exposure but not outcome can result in larger standard errors in small studies; if residual confounding exists, this can increase bias.37,38 These effects were not evident here, probably because of the large sample size and the modest associations between individual covariates and exposure. However, the possibility that an empirically generated variable could increase bias and variance represents the primary concern of the algorithm. Users of this methodology should remove covariates that are a priori expected to be strong predictors of exposure but not likely to be related to the outcome. An example of such a covariate is a history of glaucoma, a strong predictor of glaucoma treatment (but not of death), which appeared in our screening tool (eAppendix 1, http://links.lww.com/A1043) and was removed in our second example cohort. Further research is needed to consider ways in which empirically generated claims-based covariates could generate collider bias, and how they could be identified and removed.

Variable selection techniques may result in falsely narrow standard errors.39 We therefore applied bootstrapping to estimate 95% confidence intervals40 and found very similar confidence intervals compared with the simple logistic regression of the base case algorithm. This is not surprising as we did not apply any confounder selection algorithms that resulted in multiple tests of exposure-outcome associations, including change-in-estimate, forward or backward selection.36 Instead, we did a preliminary screen of candidate confounders by estimating unconditional associations of individual potential confounders with the exposure and then separately with the outcome. We then ranked covariates with regard to their potential for being a confounder and included these candidate covariates up to a predefined maximum number. We observed fairly weak unconditional associations of individual factors.

The present study is an empirical comparison of methods without a true gold standard. We used randomized trial findings to set specific expectations regarding the treatment effect estimates, but it ultimately remains unanswerable whether our high-dimensional propensity score algorithm reached that goal fully. Simulation studies are unlikely to clarify the performance of this algorithm because it is inherently empirical and relies on data-generating mechanisms that will vary from study to study, and thus are difficult to prespecify. The strength of the high-dimensional propensity score is rather that it does not make any assumptions about data quality, quantity, and interpretation. Further validation of the algorithm is possible by replicating findings that are expected based on randomized trial findings, including our example of a null association in which the high-dimensional propensity score algorithm eliminated all confounding. A specific point of concern is the performance of a covariate prioritization strategy that considers the association of each factor with the study outcome if outcomes are rare. At some point the prioritization rule may miss potentially important confounders by chance. While this is a theoretically important point it needs to be seen in light of the fact that the proposed high-dimensional proxy adjustment will be used in addition to adjustment for factors specified by the investigator.

Further work may provide improved ways of selecting covariates or using the covariates in the analysis. For example, optimization of the variable selection algorithm may be possible by considering the association between the candidate covariate and the exposure, conditional on either a set of predefined covariates or the entire list of selected covariates. It is also possible that algorithms based on cross-validation could prove to be useful for covariate selection.41,42 Finally, we have considered the use of selected covariates in an analysis that depends on correct specification of the propensity score model. Doubly robust approaches assume models for both the exposure and the outcome but remain consistent if at least one of the two models is correctly specified.18 The use of the selected covariates in the setting of a doubly robust estimator may improve the performance of the algorithm. However, since the outcome model must be more parsimonious, a separate list of covariates would need to be generated for the outcome model, perhaps those with particularly strong outcome associations.
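
To make the doubly robust idea concrete, the following Python sketch computes an augmented inverse-probability-weighted (AIPW) risk difference, one common doubly robust estimator. The article does not specify which doubly robust estimator would be used, so this choice is an assumption; the function name and its inputs (already fitted propensity scores and outcome-model predictions for each patient) are likewise hypothetical.

```python
def aipw_risk_difference(exposure, outcome, ps, mu1, mu0):
    """Augmented IPW estimate of the risk difference.

    exposure : 0/1 treatment indicators
    outcome  : 0/1 observed outcomes
    ps       : fitted propensity scores, P(exposed | covariates)
    mu1, mu0 : outcome-model predictions under exposure / no exposure
    """
    n = len(exposure)
    if n == 0:
        raise ValueError("no observations")
    if not all(0.0 < e < 1.0 for e in ps):
        raise ValueError("propensity scores must lie strictly in (0, 1)")

    total1 = 0.0
    total0 = 0.0
    for a, y, e, m1, m0 in zip(exposure, outcome, ps, mu1, mu0):
        # Outcome-model prediction plus an inverse-probability-weighted
        # residual; the residual term corrects a misspecified outcome
        # model when the propensity score model is correct.
        total1 += m1 + a * (y - m1) / e
        total0 += m0 + (1 - a) * (y - m0) / (1 - e)
    return total1 / n - total0 / n


# Toy example (hypothetical values): constant propensity score of 0.5.
exposure = [1, 1, 0, 0]
outcome = [1, 0, 0, 0]
ps = [0.5, 0.5, 0.5, 0.5]

# With an uninformative outcome model (all zeros) the estimator reduces
# to plain inverse probability weighting.
rd_ipw_only = aipw_risk_difference(exposure, outcome, ps, [0.0] * 4, [0.0] * 4)

# With outcome predictions equal to the observed group risks.
rd_with_model = aipw_risk_difference(exposure, outcome, ps, [0.5] * 4, [0.0] * 4)
```

The propensity score and outcome predictions are passed in separately, so each model can draw on its own covariate list, in line with the suggestion that the outcome model use a shorter list of covariates.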

It is too early to conclude that the proposed algorithm or variations thereof will be able to substitute for existing confounder adjustment strategies in claims data analyses, although in our limited examples the algorithm performed better than standard techniques. Practical advantages are that the algorithm can be run efficiently on a large scale, it reduces investigator and programming time substantially, and it reduces programming errors and potential mischaracterization of covariate definitions or adjustments without a loss of validity. This last point might be of particular practical advantage in studies pooling multiple claims databases.

In conclusion, in some typical pharmacoepidemiologic studies of treatment effects, the proposed proxy adjustment via high-dimensional propensity scores generated effect estimates closer to randomized trial findings, compared with standard covariate adjustment of predefined covariates. Further replication will be necessary in a variety of settings to assess the value of this approach.

REFERENCES

1.Arana A, Rivero E, Egberts TC. What do we show and who does so? An analysis of the abstracts presented at the 19th ICPE. Pharmacoepidemiol Drug Saf. 2004;13:S330–S331.

2.Schneeweiss S, Avorn J. Using health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58:323–337.

3.Strom BL, Carson JL. Use of automated databases for pharmacoepidemiology research. Epidemiol Rev. 1990;12:87–107.

4.Walker AM. Confounding by indication. Epidemiology. 1996;7:335–336.

5.Schneeweiss S. Understanding secondary databases: a commentary on “Sources of bias for health state characteristics in secondary databases.” J Clin Epidemiol. 2007;60:648–650.

6.Andersen RM. Revisiting the behavioral model and access to medical care: Does it matter? J Health Soc Behav. 1995;36:1–10.

7.Schneeweiss S, Glynn RJ, Avorn J, Solomon DH. A Medicare database review found that physician preferences increasingly outweighed patient characteristics as determinants of first-time prescriptions for COX-2 inhibitors. J Clin Epidemiol. 2005;58:98–102.

8.Roblin DW, Platt R, Goodman MJ, et al. Effect of increased cost-sharing on oral hypoglycemic use in five managed care organizations: how much is too much? Med Care. 2005;43:951–959.

9.Wooldridge JM. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press; 2001.

10.Greenland S. The effect of misclassification in the presence of covariates. Am J Epidemiol. 1980;112:564–569.

11.Greenland S, Robins JM. Confounding and misclassification. Am J Epidemiol. 1985;122:495–506.

12.Seeger JD, Williams PL, Walker AM. An application of propensity score matching using claims data. Pharmacoepidemiol Drug Saf. 2005;14:465–476.

13.McAfee AT, Ming EE, Seeger JD, et al. The comparative safety of rosuvastatin: a retrospective matched cohort study in over 48,000 initiators of statin therapy. Pharmacoepidemiol Drug Saf. 2006;15:444–453.

14.Seeger JD, Kurth T, Walker AM. Use of propensity score technique to account for exposure-related covariates: an example and lesson. Med Care. 2007;45(suppl 2):S143–S148.

15.Deleted in proof.

16.Hirano K, Imbens GW. Estimation of causal effects using propensity score weighting: An application to data on right heart catheterization. Health Serv Outcomes Res Methodol. 2001;2:259–278.

17.Judkins DR, Morganstein D, Zador P, Piesse A, Barrett B, Mukhopadhyay P. Variable selection and ranking in propensity scoring. Stat Med. 2007;26:1022–1033.

18.Van der Laan M, Robins JM. Unified Methods for Censored Longitudinal Data and Causality. New York: Springer; 2003.

19.Robins JM, Mark SD, Newey WK. Estimating exposure effects by modelling the expectation of exposure conditional on confounders. Biometrics. 1992;48:479–495.

20.Brookhart MA, Schneeweiss S, Rothman K, Glynn RJ, Avorn J, Stürmer T. Variable selection in propensity score models. Am J Epidemiol. 2006;163:1149–1156.

21.Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14:300–306.

22.Schneeweiss S, Patrick AR, Stürmer T, et al. Increasing levels of restriction in pharmacoepidemiologic database studies of elderly and comparison with randomized trial results. Med Care. 2007;45:S131–S142.

23.Bross ID. Spurious effects from an extraneous variable. J Chronic Dis. 1966;19:637–647.

24.Stürmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. J Clin Epidemiol. 2006;59:437–447.

25.Solomon DH, Schneeweiss S, Glynn RJ, et al. The relationship between selective COX-2 inhibitors and acute myocardial infarction. Circulation. 2004;109:2068–2073.

26.McKenzie DA, Semradek J, McFarland BH, Mullooly JP, McCamant LE. The validity of Medicaid pharmacy claims for estimating drug use among elderly nursing home residents: The Oregon experience. J Clin Epidemiol. 2000;53:1248–1257.

27.West S, Savitz DA, Koch G, Strom BL, Guess HA, Hartzema A. Recall accuracy for prescription medications: self report compared with database information. Am J Epidemiol. 1995;142:1103–1112.

28.Ray W. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol. 2003;158:915–920.

29.Raiford DS, Perez Gutthann S, Garcia Rodriguez LA. Positive predictive value of ICD-9 codes in the identification of cases of complicated peptic ulcer disease in the Saskatchewan hospital automated database. Epidemiology. 1996;7:101–104.

30.Moore RA, Derry S, Makinson GT, McQuay HJ. Tolerability and adverse events in clinical trials of celecoxib in osteoarthritis and rheumatoid arthritis: systematic review and meta-analysis of information from company clinical trial reports. Arthritis Res Ther. 2005;7:R644–R665.

31.Watson DJ, Harper SE, Zhao PL, Quan H, Bolognese JA, Simon TJ. Gastrointestinal tolerability of the selective cyclooxygenase-2 (COX-2) inhibitor rofecoxib compared with nonselective COX-1 and COX-2 inhibitors in osteoarthritis. Arch Intern Med. 2000;160:2998–3003.

32.Eisen GM, Goldstein JL, Hanna DB, Rublee DA. Meta-analysis: upper gastrointestinal tolerability of valdecoxib, a cyclooxygenase-2-specific inhibitor, compared with nonspecific nonsteroidal anti-inflammatory drugs among patients with osteoarthritis and rheumatoid arthritis. Aliment Pharmacol Ther. 2005;21:591–598.

33.Schneeweiss S, Solomon DH, Wang PS, Brookhart MA. Simultaneous assessment of short-term gastrointestinal benefits and cardiovascular risks of selective COX-2 inhibitors and non-selective NSAIDs: an instrumental variable analysis. Arthritis Rheum. 2006;54:3390–3398.

34.Roberts CG, Guallar E, Rodriguez A. Efficacy and safety of statin monotherapy in older adults: a meta-analysis. J Gerontol A Biol Sci Med Sci. 2007;62:879–887.

35.Glynn RJ, Schneeweiss S, Wang P, Levin R, Avorn J. Selective prescribing can lead to over-estimation of the benefits of lipid-lowering drugs. J Clin Epidemiol. 2006;59:819–828.

36.Greenland S. Invited commentary: Variable selection versus shrinkage in the control of multiple confounders. Am J Epidemiol. 2008;167:523–529.

37.Lefebvre G, Delaney JA, Platt RW. Impact of mis-specification of the treatment model on estimates from a marginal structural model. Stat Med. 2008;27:3629–3642.

38.Fu AZ, Li L. Thinking of having a higher predictive power for your first-stage model in propensity score analysis? Think again. Health Serv Outcomes Res Methodol. 2008;8:115–117.

39.Freedman DA, Navidi W, Peters SC. On the impact of variable selection in fitting regression equations. In: Dijkstra TK, ed. On Model Uncertainty and its Statistical Implications. Berlin, Germany: Springer; 1988:1–16.

40.Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, and what? Stat Med. 2000;19:1141–1164.

41.van der Laan MJ, Dudoit S. Unified Cross-Validation Methodology For Selection Among Estimators and a General Crossvalidated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities And Examples. Berkeley, CA: Division of Biostatistics, University of California, Berkeley; 2003. UC Berkeley Division of Biostatistics Working Paper Series, paper 130. (Available at: http://www.bepress.com/ucbbiostat/paper130).

42.Brookhart MA, van der Laan MJ. A semiparametric model selection criterion with applications to the marginal structural model. Comput Stat Data Anal. 2006;50:475–498.

© 2009 Lippincott Williams & Wilkins, Inc.
