November 2009 - Volume 20 - Issue 6
doi: 10.1097/01.ede.0000362296.71466.49
Abstracts: ISEE 21st Annual Conference, Dublin, Ireland, August 25-29, 2009: Oral Presentations

Evaluating Geographic Imputation Approaches for Zip Code Level Data: An Application to a Study of Pediatric Diabetes

Hibbert, James*; Liese, Angela*; Lawson, Andrew†; Porter, Dwayne*; Puett, Robin*; Standiford, Debra‡; Liu, Lenna§; Dabelea, Dana¶


Author Information

*University of South Carolina Arnold School of Public Health, Columbia, SC, United States; †Medical University of South Carolina, Charleston, SC, United States; ‡Children's Hospital Medical Center, Cincinnati, OH, United States; §Seattle Children's Hospital Research Institute, Seattle, WA, United States; and ¶University of Colorado, Denver, CO, United States.

Abstracts published in Epidemiology have been reviewed by the organizations of Epidemiology Affiliate Societies at whose meetings the abstracts have been accepted for presentation. These abstracts have not undergone review by the Editorial Board of Epidemiology.


Background and Objective:

There is increasing interest in the study of place effects on health, facilitated in part by geographic information systems. Incomplete or missing address information reduces geocoding success. Several geographic imputation methods have been suggested to overcome this limitation. The accuracy of these methods can be evaluated at the level of individuals and at higher group levels (e.g., spatial distribution).

Methods:

We evaluated four geo-imputation methods for address allocation from ZIP codes to Census tracts at the individual and group level. Two fixed (deterministic) and two random allocation methods were evaluated, using land area or population under age 20 as weighting factors. Data included 2,126 geocoded cases of incident diabetes mellitus among youth aged 0-19 between 2002 and 2003 in four U.S. regions. The imputed distribution of cases across tracts was compared to the true distribution using a chi-squared statistic.
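The difference between fixed and random allocation can be illustrated with a minimal Python sketch. The abstract does not give the exact allocation rules, so the sketch makes two assumptions: fixed allocation assigns every case in a ZIP code to the single tract with the largest weight, and random allocation draws a tract with probability proportional to its weight. The tract identifiers and weights are hypothetical and are not study data.

```python
import random

# Hypothetical weights for the Census tracts overlapping one ZIP code.
# These numbers are invented for illustration only.
LAND_AREA_KM2 = {"tract_A": 12.0, "tract_B": 3.5, "tract_C": 0.0}
POPULATION_UNDER_20 = {"tract_A": 400, "tract_B": 1500, "tract_C": 0}


def fixed_allocation(tract_weights):
    """Deterministic rule (assumed): pick the tract with the largest weight."""
    return max(tract_weights, key=tract_weights.get)


def random_allocation(tract_weights, rng):
    """Random rule (assumed): draw a tract with probability proportional to weight."""
    tracts = list(tract_weights)
    weights = [tract_weights[t] for t in tracts]
    return rng.choices(tracts, weights=weights, k=1)[0]


rng = random.Random(2009)  # seeded so that the draws are reproducible

# Four variants: two allocation rules crossed with two weighting factors.
fixed_by_area = fixed_allocation(LAND_AREA_KM2)
fixed_by_population = fixed_allocation(POPULATION_UNDER_20)
random_by_area = [random_allocation(LAND_AREA_KM2, rng) for _ in range(1000)]
random_by_population = [
    random_allocation(POPULATION_UNDER_20, rng) for _ in range(1000)
]
```

Under these assumptions, a fixed rule sends every case in the ZIP code to the same tract, whereas a random rule spreads cases across tracts roughly in proportion to the chosen weight.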

Results:

At the individual level, population-weighted fixed allocation showed the greatest accuracy, with correct Census tract assignments averaging 30.45% across all regions, followed by the population-weighted random method (21.07%). The distribution of cases across Census tracts was: 58.2% of tracts exhibited no cases, 26.2% had one case, 9.5% had two cases, and less than 3% had three or more. The true distribution was best captured by random allocation methods, with no significant differences (P-value > 0.90). However, distributions based on fixed allocation methods differed significantly from the true distribution (P-value < 0.0003).
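The two levels of evaluation can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis: individual-level accuracy is taken to be the share of cases assigned to their true tract, and the group-level comparison bins tracts by number of cases (0, 1, 2, 3 or more) before computing a Pearson chi-squared statistic. The function names and the toy data are hypothetical.

```python
from collections import Counter


def individual_accuracy(true_tracts, imputed_tracts):
    """Share of cases whose imputed tract equals the true tract."""
    matches = sum(t == i for t, i in zip(true_tracts, imputed_tracts))
    return matches / len(true_tracts)


def tracts_by_case_count(assigned_tracts, all_tracts, max_bin=3):
    """Number of tracts with 0, 1, 2, ... cases; the last bin is 'max_bin or more'."""
    cases_per_tract = Counter(assigned_tracts)
    bins = [0] * (max_bin + 1)
    for tract in all_tracts:
        bins[min(cases_per_tract[tract], max_bin)] += 1
    return bins


def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic; bins with zero expected count are skipped."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Toy data (hypothetical): five tracts and five cases.
all_tracts = ["A", "B", "C", "D", "E"]
true_tracts = ["B", "B", "B", "A", "C"]
fixed_imputed = ["B", "B", "B", "B", "B"]   # every case sent to one tract
random_imputed = ["B", "A", "C", "B", "B"]  # cases spread across tracts

true_bins = tracts_by_case_count(true_tracts, all_tracts)
fixed_bins = tracts_by_case_count(fixed_imputed, all_tracts)
random_bins = tracts_by_case_count(random_imputed, all_tracts)

fixed_accuracy = individual_accuracy(true_tracts, fixed_imputed)
random_accuracy = individual_accuracy(true_tracts, random_imputed)
fixed_chi2 = chi_squared_statistic(fixed_bins, true_bins)
random_chi2 = chi_squared_statistic(random_bins, true_bins)
```

In this toy example the fixed assignment matches more individual cases but distorts the distribution of cases per tract, while the random assignment matches fewer cases and reproduces that distribution. Converting the statistic to a P-value would require a chi-squared reference distribution, which the sketch leaves out.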

Conclusion:

Results indicate that fixed imputation methods yield the greatest accuracy at the individual level, which supports their use in studies focusing on distances to exposure sites. However, fixed methods result in artificial clusters in single Census tracts. For studies focusing on the spatial distribution of disease, random methods seemed superior, as they most closely replicated the true spatial distribution. When selecting an imputation approach, researchers should carefully consider the study aims.

© 2009 Lippincott Williams & Wilkins, Inc.
