
Estimating Rare Disease Prevalence From Administrative Hospitalization Databases

Ward, Michael M.

doi: 10.1097/01.ede.0000153643.88019.92

NIAMS/NIH, Bethesda, MD


To the Editor:

Prevalence estimates for many uncommon chronic diseases are unavailable. It may be possible to use administrative hospitalization data to estimate these prevalences. Most states mandate that hospitals report discharge abstracts for each hospitalization. These data provide the number of persons with a disease who were hospitalized in that state. Dividing this number by the proportion of persons with the disease who are hospitalized per year yields an estimate of the total number of persons with the disease.
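The calculation described above can be written compactly. The symbols are introduced here only for illustration and do not appear in the letter: $N_h$ is the number of distinct persons hospitalized with the disease in a year, $p$ is the proportion of persons with the disease who are hospitalized per year, and $\hat{N}$ is the resulting estimate of the number of persons with the disease.

```latex
\hat{N} = \frac{N_h}{p}, \qquad 0 < p \le 1
```

As a purely hypothetical example, 500 hospitalized persons and a hospitalization proportion of 0.25 would imply about 500 / 0.25 = 2000 persons with the disease.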

Hospitalization data from California were used to test this method for 5 uncommon chronic diseases that have an estimated prevalence of 1/1000 or less (hemophilia A, multiple sclerosis, Crohn disease, cystic fibrosis, and scleroderma). I chose these diseases because data on the frequency of hospitalizations were available, and because existing prevalence estimates could be used as reference standards. The prevalence of hemophilia A was from an active surveillance study in 6 states,1 and hospitalization frequency (19%) was based on 1998 data from this cohort (JM Soucie, personal communication). Prevalence estimates of multiple sclerosis are heterogeneous, ranging from 86 to 177 per 100,000.2–5 Hospitalization frequency (25%) was based on a survey of 606 members of the National Multiple Sclerosis Society.6 The prevalence of Crohn disease and its hospitalization frequency were from studies in Olmsted County, Minnesota.7–9 The prevalence of cystic fibrosis was estimated from data of the Cystic Fibrosis Foundation Registry, a national registry believed to include 75% of persons with cystic fibrosis (M. Brooks, personal communication);10,11 the prevalence computed from registry data was therefore adjusted to 100% by inflating it by 25%. Hospitalization frequencies were based on 21,564 persons in the registry in 1999.10 The prevalence of scleroderma was from a population-based study in Michigan,12 and hospitalization frequency (25%) was based on a cohort of over 500 patients from the University of Pittsburgh hospitals in 1996–1999 (V. Steen, personal communication).13

The number of persons hospitalized with these diseases from 1996 to 1999 was obtained from an administrative database operated by the Office of Statewide Health Planning and Development in California. The database includes patient age, sex, race, and up to 25 discharge diagnoses, along with encrypted unique patient identifiers. Persons were counted if the disease of interest was recorded as any of the discharge diagnoses. Because the number of persons hospitalized annually was similar across years, only data from 1999 are presented here.
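The counting rule described above (a person is counted once if the disease appears in any discharge diagnosis field) can be sketched as follows. This is a minimal illustration, not the procedure actually run on the California database: the record layout, the field names `patient_id` and `diagnoses`, the function name, and the sample records are all assumptions made for the example, and the letter does not say which diagnosis codes were used (ICD-9-CM code 340, multiple sclerosis, is used here only as a plausible stand-in).

```python
from typing import Iterable, Mapping, Set


def count_hospitalized_persons(
    discharges: Iterable[Mapping[str, object]],
    target_codes: Set[str],
) -> int:
    """Count distinct persons with a target code in any diagnosis field."""
    persons = set()
    for record in discharges:
        # A record qualifies if the disease appears in ANY diagnosis position,
        # not only as the principal diagnosis.
        if target_codes.intersection(record["diagnoses"]):
            # Deduplicate on the (encrypted) patient identifier so that
            # repeat admissions of the same person are counted once.
            persons.add(record["patient_id"])
    return len(persons)


# Hypothetical discharge abstracts, invented for illustration only.
discharges = [
    {"patient_id": "a1", "diagnoses": ["340", "599.0"]},
    {"patient_id": "a1", "diagnoses": ["486", "340"]},    # same person, readmitted
    {"patient_id": "b2", "diagnoses": ["410.9"]},          # no target code
    {"patient_id": "c3", "diagnoses": ["428.0", "340"]},   # secondary diagnosis
]

n_hospitalized = count_hospitalized_persons(discharges, {"340"})
print(n_hospitalized)
```

Counting distinct identifiers rather than discharges is the step that makes unique patient identifiers necessary: without them, a person admitted twice in a year would be counted twice.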

The number of hospitalized persons was divided by the literature-based estimate of the proportion of persons with that disease hospitalized per year. Prevalence was then calculated by dividing this number by the estimated 1999 California population.14 Prevalence was age-adjusted (using direct standardization) to allow comparisons with the literature-based prevalence. The prevalences estimated using hospitalization data closely approximated the literature-based prevalences (Table 1).
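The arithmetic in this paragraph, including direct age standardization, can be sketched as below. All function names, age groups, counts, populations, and standard weights are hypothetical and chosen only to make the steps concrete; none are taken from Table 1. The sketch also assumes that a single hospitalization proportion applies to every age group, which the letter does not state.

```python
from typing import Dict, Tuple


def estimate_cases(n_hospitalized: float, hosp_fraction: float) -> float:
    """Estimated persons with the disease = hospitalized / proportion hospitalized."""
    if not 0 < hosp_fraction <= 1:
        raise ValueError("hosp_fraction must be in (0, 1]")
    return n_hospitalized / hosp_fraction


def crude_prevalence_per_100k(
    n_hospitalized: float, hosp_fraction: float, population: float
) -> float:
    """Crude prevalence per 100,000 population."""
    return estimate_cases(n_hospitalized, hosp_fraction) / population * 100_000


def direct_age_adjusted_per_100k(
    strata: Dict[str, Tuple[float, float]],
    standard_weights: Dict[str, float],
    hosp_fraction: float,
) -> float:
    """Directly standardized prevalence per 100,000.

    strata maps age group -> (persons hospitalized, population in that group).
    standard_weights maps age group -> share of the standard population.
    """
    total_weight = sum(standard_weights.values())
    adjusted = 0.0
    for age_group, (n_hosp, pop) in strata.items():
        # Age-specific prevalence in the study population.
        stratum_prevalence = estimate_cases(n_hosp, hosp_fraction) / pop
        # Weight it by the standard population's share of that age group.
        adjusted += stratum_prevalence * standard_weights[age_group] / total_weight
    return adjusted * 100_000


# Hypothetical inputs, invented for illustration only.
strata = {"0-39": (100, 1_000_000), "40+": (300, 1_000_000)}
standard_weights = {"0-39": 0.6, "40+": 0.4}
hosp_fraction = 0.25

crude = crude_prevalence_per_100k(400, hosp_fraction, 2_000_000)
adjusted = direct_age_adjusted_per_100k(strata, standard_weights, hosp_fraction)
print(round(crude, 1), round(adjusted, 1))
```

The crude and adjusted figures differ whenever the study population's age structure differs from the standard's, which is why adjustment is needed before comparing against a literature-based prevalence from another population.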



This method was most accurate for hemophilia A, for which the comparison prevalence and hospitalization frequency were based on a multistate surveillance program with a presumably more representative sample. This method is quick and inexpensive. It requires hospitalization databases with unique patient identifiers, so that the number of patients affected can be tabulated. The more unambiguous the diagnostic coding and the more important and urgent the disease, the more accurate the resulting estimates. Additional validation of this method should be performed as more prevalence estimates for uncommon diseases become available. In the meantime, this method may be used to approximate the prevalence of uncommon chronic diseases for which no other estimates exist.

Michael M. Ward


Bethesda, MD



1. Soucie JM, Evatt B, Jackson D, Hemophilia Surveillance System Project Investigators. Occurrence of hemophilia in the United States. Am J Hematol. 1998;59:288–294.
2. Nelson LM, Hamman RF, Thompson DS, et al. Higher than expected prevalence of multiple sclerosis in Northern Colorado: dependence on methodologic issues. Neuroepidemiology. 1986;5:17–28.
3. Hader WJ, Elliot M, Ebers GC. Epidemiology of multiple sclerosis in London and Middlesex County, Ontario, Canada. Neurology. 1988;38:617–621.
4. Sweeney VP, Sadovnick AD, Brandejs V. Prevalence of multiple sclerosis in British Columbia. Can J Neurol Sci. 1986;13:47–51.
5. Mayr WT, Pittock SJ, McClelland RL, et al. Incidence and prevalence of multiple sclerosis in Olmsted County, Minnesota, 1985–2000. Neurology. 2003;61:1373–1377.
6. Whetten-Goldstein K, Sloan FA, Goldstein LB, et al. A comprehensive assessment of the cost of multiple sclerosis in the United States. Multiple Sclerosis. 1998;4:419–425.
7. Loftus EV Jr., Silverstein MD, Sandborn WJ, et al. Crohn's disease in Olmsted County, Minnesota, 1940–1993: Incidence, prevalence, and survival. Gastroenterology. 1998;114:1161–1168.
8. Loftus EV Jr. The epidemiology of Crohn's disease (letter). Gastroenterology. 1999;116:1504.
9. Silverstein MD, Loftus EV, Sandborn WJ, et al. Clinical course and costs of care for Crohn's disease: Markov model analysis of a population-based cohort. Gastroenterology. 1999;117:49–57.
10. Cystic Fibrosis Foundation. Patient Registry 2001 Annual Report. Bethesda, MD: Cystic Fibrosis Foundation, 2002.
11. FitzSimmons SC. The changing epidemiology of cystic fibrosis. J Pediatr. 1993;122:1–9.
12. Mayes MD, Lacey JV Jr., Beebe-Dimmer J, et al. Prevalence, incidence, survival, and disease characteristics of systemic sclerosis in a large U.S. population. Arthritis Rheum. 2003;48:2246–2255.
13. Steen VD, Medsger TA Jr. Severe organ involvement in systemic sclerosis with diffuse scleroderma. Arthritis Rheum. 2000;43:2437–2444.
14. State of California Department of Finance. Race/ethnic population with age and sex detail, 1970–2040. Sacramento, CA, December 1998. Accessed on 9/18/2003.
© 2005 Lippincott Williams & Wilkins, Inc.