Establishment of an 11-Year Cohort of 8733 Pediatric Patients Hospitalized at United States Free-standing Children’s Hospitals With De Novo Acute Lymphoblastic Leukemia From Health Care Administrative Data

Fisher, Brian T. DO, MSCE*,†,‡,§; Harris, Tracey BSc; Torp, Kari BA; Seif, Alix E. MD, MPH‡,∥; Shah, Ami BS, BA; Huang, Yuan-Shung V. MS*,†; Bailey, L. Charles MD, PhD‡,∥; Kersun, Leslie S. MD, MSCE‡,∥; Reilly, Anne F. MD, MPH‡,∥; Rheingold, Susan R. MD‡,∥; Walker, Dana MD‡,∥; Li, Yimei PhD; Aplenc, Richard MD, PhD‡,§,∥,¶

Medical Care:
doi: 10.1097/MLR.0b013e31824deff9
Applied Methods

Background: Acute lymphoblastic leukemia (ALL) accounts for almost one quarter of pediatric cancer in the United States. Despite cooperative group therapeutic trials, there remains a paucity of large cohort data on which to conduct epidemiology and comparative effectiveness research studies.

Research Design: We designed a 3-step process utilizing International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9) discharge diagnosis codes and chemotherapy exposure data contained in the Pediatric Health Information System administrative database to establish a cohort of children with de novo ALL. This process was validated by chart review at 1 of the pediatric centers.
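The abstract does not enumerate the 3 steps, so the following is only a minimal sketch of the general idea of combining a diagnosis-code screen with a chemotherapy-exposure check. Everything in it is an assumption made for illustration: the `Admission` record, the `candidate_cohort` helper, the use of the ICD-9-CM 204.0x prefix as the ALL screen, the list of induction agents, and the two-agent threshold. None of these are taken from the study or from the Pediatric Health Information System schema.

```python
from dataclasses import dataclass, field

# Assumption: ICD-9-CM codes beginning with 204.0 denote acute lymphoid leukemia.
ALL_ICD9_PREFIX = "204.0"

# Assumption: illustrative induction agents; the abstract does not list the drugs used.
INDUCTION_AGENTS = {"vincristine", "asparaginase", "prednisone", "dexamethasone"}


@dataclass
class Admission:
    """Hypothetical record for one hospital admission."""
    patient_id: str
    icd9_codes: list
    drugs_billed: set = field(default_factory=set)


def has_all_code(adm):
    """Diagnosis-code screen: any discharge code with the ALL prefix."""
    return any(code.startswith(ALL_ICD9_PREFIX) for code in adm.icd9_codes)


def has_induction_exposure(adm, min_agents=2):
    """Chemotherapy screen: at least `min_agents` distinct induction agents billed."""
    return len(adm.drugs_billed & INDUCTION_AGENTS) >= min_agents


def candidate_cohort(admissions):
    """Return sorted patient IDs that pass both screens on the same admission."""
    return sorted(
        {a.patient_id for a in admissions
         if has_all_code(a) and has_induction_exposure(a)}
    )


if __name__ == "__main__":
    sample = [
        Admission("p1", ["204.00"], {"vincristine", "prednisone"}),
        Admission("p2", ["204.00"], set()),  # coded as ALL, no chemotherapy
        Admission("p3", ["205.00"], {"vincristine", "asparaginase"}),  # other code
    ]
    print(candidate_cohort(sample))
```

In this toy data, a patient who carries an ALL code but has no chemotherapy exposure is excluded, which is the kind of misclassification a code-only definition would not catch.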

Results: An ALL cohort of 8733 patients was identified with a sensitivity of 88% [95% confidence interval (CI), 83%–92%] and a positive predictive value of 93% (95% CI, 89%–96%). The 30-day all-cause inpatient case fatality rate using this 3-step process was 0.80% (95% CI, 0.63%–1.01%), which was significantly different from the case fatality rate of 1.40% (95% CI, 1.23%–1.60%) when ICD-9 codes alone were used.
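For readers less familiar with these measures, the sketch below shows how sensitivity, positive predictive value, and a 95% CI for a proportion are computed. It rests on stated assumptions: the abstract reports neither the chart-review counts nor the number of deaths nor the interval method, so the validation counts (176 true positives, 24 false negatives, 13 false positives) are hypothetical values chosen only to land on the reported point estimates, the 70 deaths are inferred from 0.80% of 8733, and the Wilson score interval is one common choice rather than the authors' confirmed method.

```python
import math


def sensitivity(true_pos, false_neg):
    """Share of true ALL cases that the algorithm identified."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Share of algorithm-identified cases that were true ALL on chart review."""
    return true_pos / (true_pos + false_pos)


def wilson_interval(events, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.959964 gives 95%)."""
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical chart-review counts, for illustration only.
    print(f"sensitivity = {sensitivity(176, 24):.2f}")
    print(f"PPV         = {positive_predictive_value(176, 13):.2f}")

    # Assumed death count: 0.80% of 8733 patients is roughly 70.
    low, high = wilson_interval(70, 8733)
    print(f"case fatality 95% CI = {100 * low:.2f}% to {100 * high:.2f}%")
```

Under these assumptions the interval comes out close to the reported 0.63%–1.01%, although the abstract does not say which interval method was used.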

Conclusions: This is the first report of assembly and validation of a cohort of de novo ALL patients from a database representative of free-standing children’s hospitals across the United States. Our data demonstrate that the use of ICD-9 codes alone to establish cohorts will lead to substantial patient misclassification and result in biased outcome estimates. Systematic methods beyond the use of just ICD-9 codes must be used before analysis to establish accurate cohorts of patients with malignancy. A similar approach should be followed when establishing future cohorts from administrative data.

Author Information

*Division of Infectious Diseases

†Center for Pediatric Clinical Effectiveness, The Children’s Hospital of Philadelphia

‡Department of Pediatrics

§Center for Clinical Epidemiology and Biostatistics, the Perelman School of Medicine at the University of Pennsylvania

∥Division of Oncology, The Children’s Hospital of Philadelphia

¶Department of Biostatistics and Epidemiology, the Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA

Grant Support: NIH RO1 CA133881-01 (Aplenc), Canuso Foundation Innovation Award (Seif).

The authors declare no conflict of interest.

Reprints: Brian T. Fisher, DO, MSCE, Division of Infectious Diseases, The Children’s Hospital of Philadelphia, 34th and Civic Center Boulevard CHOP North, Room 1515, Philadelphia, PA 19104.

