Using Self-reports or Claims to Assess Disease Prevalence: It’s Complicated

St. Clair, Patricia ScB; Gaudette, Étienne PhD; Zhao, Henu PhD; Tysinger, Bryan MSc; Seyedin, Roxanna MPh; Goldman, Dana P. PhD

doi: 10.1097/MLR.0000000000000753
Original Articles

Background: Two common ways of measuring disease prevalence are: (1) using self-reported disease diagnoses from survey responses; and (2) using disease-specific diagnosis codes found in administrative data. Because claims do not suffer from self-report biases, they are often assumed to be more objective. However, it is not clear that claims always produce better prevalence estimates.

Objective: To assess discrepancies between self-report and claims-based measures for 2 diseases in the US elderly, and to investigate the definition, selection, and measurement error issues that may help explain the divergence between claims and self-report estimates of prevalence.

Data: Self-reported data from 3 sources are included: the Health and Retirement Study, the Medicare Current Beneficiary Survey, and the National Health and Nutrition Examination Survey. Claims-based disease measurements are drawn from Medicare claims linked to Health and Retirement Study and Medicare Current Beneficiary Survey participants, comprehensive claims data from a 20% random sample of Medicare enrollees, and private health insurance claims from Humana Inc.

Methods: The prevalence of diagnosed disease in the US elderly is computed and compared across sources. Two medical conditions are considered: diabetes and heart attack.

Results: Comparisons of diagnosed diabetes and heart attack prevalence show similar trends by source, but claims differ from self-reports with regard to levels. Selection into insurance plans, disease definitions, and the reference period used by algorithms are identified as sources contributing to differences.
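To illustrate why the reference period matters, the following minimal sketch computes a claims-based prevalence under two look-back windows and compares it with a self-report prevalence. It is not the authors' algorithm: the toy records, the single-claim case definition, the ICD-9 prefix match, and the function names `claims_prevalence` and `self_report_prevalence` are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical toy claims: (person_id, ICD-9 diagnosis code, service date).
claims = [
    ("a", "250.00", date(2008, 3, 1)),   # diabetes claim in the survey year
    ("b", "250.00", date(2005, 6, 15)),  # diabetes claim several years earlier
    ("c", "401.9", date(2008, 1, 10)),   # unrelated (hypertension) claim
]

# Hypothetical survey answers to an "ever told you have diabetes" question.
self_report = {"a": True, "b": True, "c": False, "d": False}


def claims_prevalence(claims, population, code_prefix, start, end):
    """Share of the population with >= 1 matching claim dated in [start, end].

    Assumption: one qualifying claim is enough to count as a case.
    """
    cases = {
        pid
        for pid, code, day in claims
        if code.startswith(code_prefix) and start <= day <= end
    }
    return len(cases & set(population)) / len(population)


def self_report_prevalence(answers):
    """Share of respondents who report ever having been diagnosed."""
    return sum(answers.values()) / len(answers)


population = list(self_report)
one_year = claims_prevalence(
    claims, population, "250", date(2008, 1, 1), date(2008, 12, 31)
)
five_year = claims_prevalence(
    claims, population, "250", date(2004, 1, 1), date(2008, 12, 31)
)
survey = self_report_prevalence(self_report)
print(one_year, five_year, survey)
```

In this toy data, person "b" has no diabetes claim in the most recent year, so the one-year window yields a lower level than the self-report, while the longer window recovers the same level.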

Conclusions: Claims and self-reports both have strengths and weaknesses, which researchers need to consider when interpreting estimates of prevalence from these 2 sources.

Supplemental Digital Content is available in the text.

Schaeffer Center for Health Policy and Economics, University of Southern California Price School and School of Pharmacy, Los Angeles, CA

The authors are grateful to the National Institute on Aging for its support through the Roybal Center for Health Policy Simulation (grant no. P30AG024968). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

A preliminary version of this research was presented at the 50th Annual Conference of the Canadian Economics Association on June 3, 2016.

The authors declare no conflict of interest.

Reprints: Patricia St. Clair, ScB, Schaeffer Center for Health Policy and Economics, University of Southern California Price School and School of Pharmacy, 635 Downey Way, Suite 210, Los Angeles, CA 90089. E-mail:

Copyright © 2017 Wolters Kluwer Health, Inc. All rights reserved.