Leveraging Linkage of Cohort Studies With Administrative Claims Data to Identify Individuals With Cancer

Bronson, Mackenzie R., BA*; Kapadia, Nirav S., MD, MS*,†; Austin, Andrea M., PhD*; Wang, Qianfei, MS*; Feskanich, Diane, ScD; Bynum, Julie P.W., MD, MPH*; Grodstein, Francine, ScD; Tosteson, Anna N.A., ScD*,†

doi: 10.1097/MLR.0000000000000875
Online Articles: Applied Methods

Background: In an effort to overcome quality and cost constraints inherent in population-based research, diverse data sources are increasingly being combined. In this paper, we describe the performance of a Medicare claims-based incident cancer identification algorithm in comparison with observational cohort data from the Nurses’ Health Study (NHS).

Methods: NHS-Medicare linked participants’ claims data were analyzed using 4 versions of a cancer identification algorithm across 3 cancer sites (breast, colorectal, and lung). The algorithms evaluated included an update of the original Setoguchi algorithm, and 3 other versions that differed in the data used for prevalent cancer exclusions.
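The difference between the algorithm versions, as described above, lies in which data source informs the removal of prevalent cases. The following is a minimal sketch of that one step only, not the authors' implementation: the function name `exclude_prevalent`, the `source` labels, and the use of participant ID sets are all assumptions made for illustration, and the claims-based case definition itself is not shown.

```python
from typing import Set


def exclude_prevalent(
    candidates: Set[int],
    claims_prevalent: Set[int],
    cohort_prevalent: Set[int],
    source: str,
) -> Set[int]:
    """Drop prevalent cases from claims-identified candidate incident cases.

    All names are hypothetical. `source` selects which data inform the
    exclusion: "claims", "cohort" (e.g. self-reported history), or "both".
    """
    if source == "claims":
        excluded = claims_prevalent
    elif source == "cohort":
        excluded = cohort_prevalent
    elif source == "both":
        # A participant flagged as prevalent by either source is removed.
        excluded = claims_prevalent | cohort_prevalent
    else:
        raise ValueError(f"unknown source: {source!r}")
    return candidates - excluded


# Illustrative participant IDs only; not study data.
candidates = {1, 2, 3, 4}
claims_prevalent = {2}
cohort_prevalent = {3}
remaining = exclude_prevalent(candidates, claims_prevalent, cohort_prevalent, "both")
```

With the "both" setting, a candidate is retained only if neither source indicates a prior cancer, which is the strictest of the three exclusion rules sketched here.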

Results: The algorithm that yielded the highest positive predictive value (PPV) (0.52–0.82) and κ statistic (0.62–0.87) in identifying incident cancer cases used both Medicare claims and observational cohort data (NHS) to remove prevalent cases. The algorithm that used only NHS data to remove prevalent cancer cases performed nearly equivalently (PPV, 0.50–0.79; κ, 0.61–0.85), whereas the version that used only claims data to remove prevalent cancer cases performed substantially worse than the dual-source algorithm (PPV, 0.42–0.60; κ, 0.54–0.70).
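For readers less familiar with the two agreement measures reported above, the sketch below computes PPV and Cohen's κ from a 2×2 table of algorithm result against reference standard. It uses the standard textbook definitions; the function names and the counts are illustrative assumptions and are not taken from the study.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: share of algorithm-flagged cases
    that the reference standard confirms."""
    return tp / (tp + fp)


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa for a 2x2 table (algorithm vs. reference standard)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)


# Made-up counts for illustration only.
tp, fp, fn, tn = 40, 10, 5, 45
example_ppv = ppv(tp, fp)
example_kappa = cohen_kappa(tp, fp, fn, tn)
```

PPV depends only on the cases the algorithm flags, whereas κ also reflects missed cases and true negatives, so the two measures can rank algorithms differently.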

Conclusions: Our findings suggest claims-based algorithms identify incident cancer with variable reliability when measured against an observational cohort study reference standard. Self-reported baseline information available in cohort studies is more effective in removing prevalent cancer cases than are claims data algorithms. Use of claims-based algorithms should be tailored to the research question at hand and the nature of available observational cohort data.

*The Dartmouth Institute for Health Policy and Clinical Practice

†Norris Cotton Cancer Center, Geisel School of Medicine at Dartmouth, Lebanon, NH

Department of Medicine, Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA

F.G. and A.N.A.T. both served as senior authors.

Supported by the National Cancer Institute (grant number UM1CA186107).

The authors declare no conflict of interest.

Reprints: Mackenzie R. Bronson, BA, One Medical Center Drive, WTRB Level 1, Lebanon, NH 03756. E-mail:

Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved.