Depression is the most common of the mental disorders, with a lifetime prevalence of nearly 21% among adults in the United States.1 The occurrence, treatment challenges, and progression2 of many chronic diseases (eg, diabetes, cancer, cardiovascular disease, asthma, and obesity) are worsened by concomitant depression, as are many health risk behaviors (eg, physical inactivity, smoking, excessive drinking, and insufficient sleep). Estimates suggest that depression will be the second leading cause of disability worldwide by 2020, trailing only ischemic heart disease.3 Stigma associated with mental illness4 often obscures our ability to identify this condition accurately as some patients may be hesitant to report symptoms during an encounter or even seek help.5 Personal and cultural overtones have delayed health-seeking behavior, reducing reach, quality, and cost-effectiveness of depression care and opportunity to achieve better outcomes for associated health conditions.6
Community members consistently identify depression and other mental health disorders as high priorities for public health interventions.7 Disparities by demographic group have been observed in national studies.8 In response, local public health agencies seek effective means to identify and address mental health disorder (especially depression) disparities in their jurisdictions. Targeted intervention efforts may be broadly implemented at a county level but often smaller geographic areas (eg, communities or neighborhoods)9 are the real focus. These geographic regions often represent shared cultures and economic perspectives, which may permit more targeted and tailored intervention messages.10 However, few data exist to accurately estimate subcounty depression prevalence rates. As many public health agencies incorporate mental health initiatives in their community health improvement plans, they need more granular estimates of the prevalence of mental health disorders to frame the problem and effectively engage community partners around issues for their region. Accurate information would also permit local public health agencies to evaluate the effectiveness of targeted, evidence-based (both clinic-11,12 and community-based13) mental health interventions for community residents. While national, state, or local depression prevalence rates may be estimated from federally sponsored surveys,14,15 these rates are rarely current or granular enough to support targeted community-based interventions within a jurisdiction.
Electronic health records (EHR) have demonstrated utility in providing surveillance data on issues of public health importance16 (eg, adverse drug and device events) including specific diseases or conditions17–20 (eg, diabetes mellitus and hepatitis B). Some data-sharing technologies16,21 may enhance the ability of EHR-derived data to be harvested across health care providers to generate information that complements surveys. With EHR data and increased sample size, smaller demographic subgroups and geographic units are better represented within a jurisdiction, based on a patient's characteristics and residence.
This study was undertaken to better understand novel EHR-based surveillance opportunities and their capacity to complement existing survey data for depression. Our specific goals were to (1) compare the attributes (ie, diagnostic method, specificity, representativeness, and geographic granularity) of EHR-based depression surveillance versus previously published reports for a single urban community and (2) assess subcounty variation in EHR-generated depression prevalence estimates in an urban area. We sought to understand how a complementary surveillance source might inform a community seeking methods to address a common disease such as depression.
The City and County of Denver, Colorado's state capital, has a population of about 650 000 with a large Hispanic/Latino population (24%) and smaller African American population (10%).22 Kaiser Permanente Colorado (KPCO) provides care to more than 600 000 Coloradoans (including more than 100 000 in Denver County), and Denver Health (DH) cares for more than 150 000 Denver residents. Collectively, these 2 integrated delivery systems care for nearly 40% of Denver County's population in distinctly different population subgroups. KPCO offers services largely to employed individuals and their families, while DH, a safety-net organization, serves more economically challenged individuals and families.
Inventory of data sources and data source evaluation
We first conducted a PubMed search for published depression estimates to identify commonly used national and local sources of data that might provide information on depression prevalence in Denver County; results from these articles with a prevalence estimate were compared with prevalence estimates from KPCO and DH. The inventory yielded prevalence data from the Behavioral Risk Factor Surveillance System (BRFSS),14 National Comorbidity Survey,23 the National Survey on Drug Use and Health,24 and the National Health and Nutrition Examination Survey,15 as well as from 8 managed care organizations across the United States participating in the Mental Health Research Network.25 Those data sources varied by collection method (survey vs administrative data), cohort selection schema (random vs convenience sampling), population included (community-dwelling individuals vs individuals receiving health or mental health care services), measurement method (eg, structured interview questions, symptom severity questions, or diagnosis [International Classification of Diseases, Ninth Revision (ICD-9)] codes), cohort size, time frame, and geographic location. For each data source publication, we abstracted the sample size, prevalence rate, timeliness (eg, most recent or survey frequency), granularity or geographic location (eg, lowest geo-spatial level of analysis for reporting), and method (eg, screening, related-questions, or diagnosis).
Electronic health record data
Both KPCO and DH have EHR systems with access to diagnostic data recorded by clinicians after each encounter. As part of a community initiative, the Colorado Health Observation Regional Data Service (CHORDS),26 both institutions have stored their EHR data in a common data model, the Virtual Data Warehouse originally developed by the Health Care Systems Research Network.27 This is a data model used by many health care institutions across the country that participate in the PCORnet initiative.28 The regional service uses a query technology21 implemented in several large federal initiatives,16,29 which has been used at the local level as well.17,19,30 The public health surveillance use of CHORDS was reviewed and deemed nonhuman subjects research by the Colorado Multiple Institutional Review Board.
We restricted the analysis to adults 18 years of age or older who received care in either system between January 1, 2011, and December 31, 2012. We retrieved demographic data (ie, age, gender, and residential address) from EHR at DH and KPCO, along with diagnostic codes (ICD-9) for all outpatient visits. Depression was a common diagnosis in both systems and is recorded by a clinician based on a clinical encounter.31 Any adult with at least 1 depression diagnostic code (ie, mood disorder = ICD-9: 296.x, depressive-type psychosis = 298.0, adjustment reaction = 309.x, major depressive disorder = 311) was considered to have a diagnosis of depression. To be included in this geo-spatial analysis, a geo-locatable residence address needed to be established, based on the address declared at the last visit during the time interval. Thus, all homeless individuals were excluded from mapping visualizations.
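The case definition above can be illustrated with a minimal sketch. The function and constant names are hypothetical, and the code assumes ICD-9 codes arrive as dotted strings (eg, "296.32"); it is not the query logic used in the study.

```python
# Illustrative only: helper names are hypothetical, not from the study.
# Code families follow the case definition stated in the text.
DEPRESSION_ROOTS = ("296", "309")      # 296.x and 309.x families
DEPRESSION_EXACT = {"298.0", "311"}    # single codes named in the text


def is_depression_code(code: str) -> bool:
    """Return True if one ICD-9 code falls in the stated case definition."""
    code = code.strip()
    if code in DEPRESSION_EXACT:
        return True
    # Compare only the part before the decimal point for the .x families.
    return code.split(".")[0] in DEPRESSION_ROOTS


def has_depression(outpatient_codes) -> bool:
    """An adult is a case if at least 1 outpatient code qualifies."""
    return any(is_depression_code(c) for c in outpatient_codes)
```

Requiring 2 or more qualifying codes instead of 1 would be a small change to `has_depression` and would yield a more conservative case count.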
Using 5-year (2008-2012) American Community Survey denominator estimates, we first calculated the proportion of residents in each census tract who met our diagnostic criterion for depression, based on the combined total patient population data from 2 health care data sources, divided by the American Community Survey estimated base population. Age-gender pyramids were generated to compare the clinical population with the general population. An age- and gender-adjusted depression prevalence rate was also calculated for the county as a whole. An unadjusted depression prevalence rate was calculated for each census tract in Denver County. Prevalence and standard error of the mean (SEM) were calculated for the jurisdiction and each subgroup. Age and gender adjustment were then performed to more closely approximate the general population distribution.30 A finite population correction32 was performed, given the nonrandomness of selection into the clinical population (eg, having a means to pay for care and care-seeking behavior). Once calculated and adjusted, the depression prevalence rates by census tract were represented geospatially using GeoDa software.
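The calculations described above can be sketched as follows. This is a minimal illustration under stated assumptions: the standard error uses the usual binomial formula, the finite population correction uses the textbook factor sqrt((N - n) / (N - 1)), and adjustment is done by direct standardization to census stratum counts. The function names are hypothetical, and the exact formulas applied in the study may differ.

```python
import math


def prevalence(cases: int, patients: int) -> float:
    """Unadjusted prevalence: diagnosed cases divided by patients observed."""
    return cases / patients


def sem_with_fpc(p: float, n: int, population: int) -> float:
    """Binomial standard error of a proportion, shrunk by the textbook
    finite population correction when n patients come from a base
    population of known size (assumed formula, see lead-in)."""
    se = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return se * fpc


def direct_standardize(stratum_rates: dict, standard_pop: dict) -> float:
    """Weight each age-gender stratum rate by that stratum's share of the
    general population (direct standardization)."""
    total = sum(standard_pop.values())
    return sum(stratum_rates[s] * standard_pop[s] / total for s in standard_pop)
```

The correction factor approaches 0 as the clinical population approaches the full base population, which is why it matters when the EHR data cover a large share of residents.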
Summarized data at the census tract (CT) level were imported into GeoDa (Version 0.925) for a spatial analysis of depression prevalence. Box plots, box maps (Hinge = 1.5), and histograms identified the values and locations of lower and upper outliers as well as statistical measurements. An adjustment (ie, smoothing and weighting) of upper and lower outlying rates was used to reduce rate variability associated with population differences. To minimize variance instability of depression prevalence, we used spatial rate smoothing methods combined with Queen Contiguity spatial weighting.33,34 Rate estimations varied on the basis of whether a CT (1) shared a common border or common vertices with, or (2) had greater proximity to another CT. Weighting and smoothing methods were combined to optimally produce the fewest outliers and most dense neighborhood clusters; local autocorrelation was determined using the Local Indicators of Spatial Association.35
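The idea behind spatial rate smoothing with queen contiguity can be sketched in a few lines. This is an illustration, not GeoDa's implementation: tracts are represented as cells on a hypothetical grid (real census tracts are irregular polygons), queen neighbors are cells that share an edge or a corner, and the smoothed rate pools cases and population over each tract and its neighbors.

```python
def queen_neighbors(cells):
    """Map each (row, col) cell to the cells sharing an edge or a corner."""
    cell_set = set(cells)
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    return {
        (r, c): [(r + dr, c + dc) for dr, dc in offsets
                 if (r + dr, c + dc) in cell_set]
        for (r, c) in cell_set
    }


def spatial_rate_smooth(cases, population, neighbors):
    """Smoothed rate per tract: pooled cases over pooled population for
    the tract itself plus its contiguous neighbors."""
    smoothed = {}
    for tract, nbrs in neighbors.items():
        window = [tract] + list(nbrs)
        smoothed[tract] = (sum(cases[t] for t in window)
                           / sum(population[t] for t in window))
    return smoothed
```

Pooling over neighbors pulls rates from small-population tracts toward their surroundings, which is the variance-stabilizing effect the paragraph describes.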
Census tracts were scored for weighted depression prevalence rates using a simple scoring system developed to identify clusters. High-high was defined as a high-value depression prevalence CT neighboring on at least 1 other high-value depression prevalence CT. The inverse, or low-low, indicates a low-value depression prevalence CT near another low-value prevalence CT. Each may indicate potential areas of interest.
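The scoring rule above can be sketched as follows. The text does not state the cutoff that separates high-value from low-value tracts, so this illustration assumes the mean of all tract rates; the function name is hypothetical, and the sketch omits the permutation-based significance testing that a Local Indicators of Spatial Association analysis performs.

```python
from statistics import mean


def classify_clusters(rates, neighbors):
    """Label each tract high-high, low-low, or other.

    rates: dict of tract -> (smoothed) prevalence rate
    neighbors: dict of tract -> list of contiguous tracts
    Assumption: "high" and "low" mean above or below the mean tract rate.
    """
    cutoff = mean(rates.values())
    labels = {}
    for tract, rate in rates.items():
        nbr_rates = [rates[n] for n in neighbors.get(tract, [])]
        if rate > cutoff and any(r > cutoff for r in nbr_rates):
            labels[tract] = "high-high"
        elif rate < cutoff and any(r < cutoff for r in nbr_rates):
            labels[tract] = "low-low"
        else:
            labels[tract] = "other"
    return labels
```

A high-rate tract surrounded only by low-rate tracts is labeled "other" here, since it is a spatial outlier and not part of a cluster.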
Our initial inventory identified 6 sources of information about estimated depression prevalence rates that produced 9 different estimates based on defined population, time frame, and geographic location. Results are summarized in ascending order in Table 1. Reflecting the diversity of methods used to assess depression, the overall rate varied from 7% to nearly 18%. The next to last line of the table used data calculated from the combined DH and KPCO EHR systems for patients who were residents within Denver County. The prevalence estimate of 12.7%, from DH and KPCO EHR data, was in the middle of the range generated by these data sources.
When DH and KPCO patients were pooled, 36% of the adult residents of Denver County were represented in the data (Table 2). Population coverage rates varied between 11% and 45% across census tracts. Denver resident coverage varied by demographic group with higher coverage among Hispanic (34%), African American (38%), and mixed race or unknown (55%) than for whites (19%) or Asian/Pacific Islanders (19%). Age-gender pyramids for Denver County and the EHR-observed subpopulation were aggregated and compared in Figure 1. In the EHR-based population, the groups between 20 and 49 years of age were underrepresented compared with Denver County as a whole. The proportion of men who received care in these institutions was lower than their proportion in the city as a whole.
Among 21 578 patients with a diagnosis of depression, 55% had at least 2 visits with the diagnosis while 45% had just 1 visit with a concordant diagnosis. The unadjusted prevalence of depression was 12.7%. Rates of depression differed by gender, race/ethnicity, and age (Table 2). Women had a higher rate than men (15.7% vs 8.8%, respectively). Whites had the highest rate (16.3%) and Asian/Pacific Islanders had the lowest rate (6.4%). Across the life span, increasing age was associated with higher rates of depression. Individuals aged 18 to 24 years had the lowest rates (6.8%) while those older than 75 years had the highest rates (20.7%). The average number of cases in a census tract was 143 (SEM ± 5), while the average number of patients per census tract was 1150 (SEM ± 142). The age- and gender-adjusted depression prevalence rate for Denver County was 12.3%, with census tract-specific rates ranging from 5% to 20%. While it is impossible to estimate coverage for the base population who are homeless, homeless patients had the highest depression rate of any demographic group (17.9%). Depression prevalence rate estimates by census tract are presented in Figure 2a.
Local autocorrelation spatial rate smoothing with Queen Contiguity weighting under the randomization test had a pseudo P value of ≤ .001. The cluster map in Figure 2b shows 2 predominant positive (high-high) areas in the southeast and southwest areas of the county and 2 predominantly negative (low-low) areas across the northern border of the county. Autocorrelation demonstrated clusters with 13 tracts in dark red (high prevalence) and 17 in dark blue (low prevalence).
Multiple published data sources have estimated depression prevalence at various jurisdictional levels, but none was sufficiently granular to offer subcounty depression prevalence estimates for Denver County. Electronic health record–based depression prevalence estimates permitted more granular depression prevalence monitoring. National surveys permit national- and state-level estimates, but local public health agencies seeking disparity measures would find it difficult to estimate subcounty (eg, zip code, neighborhood, or census tract) depression prevalence from these data. Prevalence of depression varied greatly across census tracts within the same county; the cause for variation may be multifactorial but may represent underdiagnosis for some groups or geographic regions. What community-based interventions might be applied? These alternative surveillance methods, with capacity for more granular estimates, may have value as assessment tools for public health interventions.
With the exception of the Mental Health Research Network (MHRN) study, all compared data sources (Table 1) were survey-based. In this study, the EHR-derived prevalence estimate for depression across 2 systems was 12.7%, roughly in the midrange between the low estimate of 6.7% obtained from the National Survey on Drug Use and Health and the high estimate of 17.9% derived from the Colorado BRFSS for Denver County.36 Electronic health record estimates found higher rates among women than men (15.7% vs 8.8%, respectively), but BRFSS37 data showed less difference (7.8 vs 6.2, respectively). The BRFSS-based rates of depression also varied by age, but in the opposite direction: younger individuals had higher rates of depression in BRFSS data, whereas older individuals had higher rates in EHR-based estimates. Differences in questionnaire design and method of administration may lead to varying levels of certainty for case definitions by survey type. The National Survey on Drug Use and Health defined Major Depressive Episode consistent with the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders, which specifies “a period of at least 2 weeks when a person experienced a depressed mood or loss of interest or pleasure in daily activities and had a majority of specified depression symptoms.” For BRFSS, the question was: “[Were you] ever told you have a depressive disorder (including depression, major depression, dysthymia, or minor depression)?” Results could vary dramatically on the basis of question or method, as compared with EHR documentation by a clinician during the course of care. Methods for clinical documentation and assessment are fairly similar across institutions and time; thus, EHR data offer a complementary and consistent assessment tool with ease of repeated measures for populations over time.
The EHR data from 2 systems identified significant variation across Denver's neighborhoods and census tracts. Previous analyses have shown relatively stable estimates of depression across the 2 health care systems.31 Distribution of depression prevalence rates across census tracts permits aggregation to larger geographic units that are particularly meaningful to specific audiences, such as neighborhood residents or city council members, for targeted engagement with community-based organizations or city government. While challenging to develop, emerging query solutions19,38 for aggregated data across health care providers are initial tools for a learning health system28 that leverages EHR data. These emerging more granular sources of information have promise to fill localized measure gaps in communities across the country, while complementing national and regional survey measures.
Several limitations exist in this approach. Comparison of prevalence estimates was predicated on varying definitions of depression from the various data sources. Differing methods for establishing the outcome (eg, questionnaire, survey, or clinical observation) make comparisons problematic. Perhaps more important, however, is understanding how complementary definitions provide different perspectives. The Behavioral Risk Factor Surveillance System is focused on lifetime prevalence while the period of time used to capture depression diagnoses via EHR for this study was just 2 years. Estimates may not be comparable but point to the challenge for public health agencies trying to assess the problem, define a public message and scope, or target a response. No clear gold standard exists with which to compare these measures. While these inherent challenges emerge from using new tools, consistent repeated measures using this 1 tool may help monitor and evaluate community-based interventions.
Our study was unable to deduplicate patients who were seen in both systems over this 2-year period. Because we used deidentified data, these individuals would be double-counted. From prior local analyses, this number was estimated at 8.5% (A. J. Davidson, MD, MSPH, written communication, 2009). Although no national personal identifier exists to facilitate deduplication, a potential solution to this problem is to use the master patient index of a local health information exchange or a statewide initiative as currently funded by the Centers for Medicare & Medicaid Services39 during subsequent analyses. Efforts to use these approaches are ongoing in Denver County. This problem of duplicate counts may increase over longer observation periods as individuals change health insurance coverage or sites of care. Use of last address may also result in misclassification. If a person does not update the address (which typically happens at each visit), cases may be assigned to the wrong census tract.
Another small but important limitation of a geographic analysis is the exclusion of homeless individuals. While the homeless had the highest rate of depression in our sample, there is no method to represent them on a map. Specific outreach programs to those communities will need to employ alternative methods that target these individuals through places of congregation and social service delivery.
In addition, diagnostic codes for depression may lack sensitivity and specificity when compared with “gold standard” interviews. We selected at least 1 depression diagnosis for inclusion but would have generated more conservative estimates by using 2 or more depression diagnoses. During the 2-year period, many patients may not have repeat diagnosis-coded visits, if they are stable and controlled on medications. Even if collecting survey information on larger numbers of individuals at the subcounty level were feasible, the wide range in survey-based prevalence estimates (Table 1) emphasizes the problems with using even traditional data sources to support assessment of local public health efforts to combat depression.
Implications for Policy & Practice
- Depression and mental health issues are highly prevalent diagnoses and frequently associated with poor health outcomes for those patients. Public health agencies should promote effective and targeted community-based interventions to complement clinical mental health treatment efforts.
- Knowing where to focus limited public health resources requires that health departments establish subcounty depression prevalence measures. A sufficiently scaled, subcounty survey would be too costly.
- In the absence of local level, population-based surveys, electronic health records (EHR) provide a novel way to estimate depression prevalence. This study observed differences in depression prevalence by region and demographic subgroups.
- Presentation of these results permits more focused discussions during community and other stakeholder engagement. Cluster assessment identified both regions of higher and lower depression prevalence. Were lower rates truly areas of better mental health or areas where access or stigma interferes with clinical engagement? How might these observations be further understood or validated?
- Public health agencies should consider the opportunity and evaluate EHR system data as a surveillance tool to estimate subcounty chronic disease prevalence. In the future, by harnessing routinely collected clinical information, depression monitoring may help gauge the effectiveness of any public health campaigns.
Similar to prior survey studies, this EHR-based study found depression prevalence varied by gender, race/ethnicity, age, and living status. Some of these findings were contrary to previously published reports. Were these differences based more on the method of defining disease or on the population being studied? Before adoption of this alternative EHR-based surveillance method, we must better understand how the opportunity for more granular depression prevalence estimates should be balanced with concerns about selection bias (eg, care-seeking individuals) in the measured population. Widespread EHR40 adoption makes nonsurvey-based methods of depression prevalence monitoring more viable. Some researchers and communities have begun work to validate these EHR estimates through neighborhood-level surveys to better assess accuracy of EHR-based estimates.38,41 This process of validation will be important to allay concerns about selection bias for those accessing and represented in an EHR-based estimate.
Most local health departments have few data to address this highly prevalent problem. Some may see opportunity to use EHR-based estimates to better describe a continuum of depression screening, diagnosis, and treatment control.42 This should be an area for active research as clinicians and public health officials seek tools to better describe mental health service gaps, assess program effectiveness, and drive public health or clinical service planning and resource allocation. This first look at EHR-based depression prevalence suggests the need for additional research to better establish EHRs as a complementary surveillance resource for public health to guide prevention, outreach, and treatment efforts and how to interpret EHR-based findings considering other factors (eg, social determinants of health43). Working with clinicians, local public health agencies can encourage system-wide changes and feedback loops to ensure early identification and adequate treatment of a highly prevalent disease with high and serious associated morbidity and mortality.
1. Kessler RC, Berglund P, Demler O, Jin R, Merikangas KR, Walters EE. Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the national comorbidity survey replication. Arch Gen Psychiatry. 2005;62(6):593–602.
2. Chapman DP, Perry GS, Strine TW. The vital link between chronic disease and depressive disorders. Prev Chronic Dis. 2005;2:A14.
3. Murray CJL, Lopez AD. The Global Burden of Disease: A Comprehensive Assessment of Mortality and Disability From Diseases, Injuries and Risk Factors in 1990 and Projected to 2020. Geneva, Switzerland: World Health Organization; 1996.
4. Corrigan P. How stigma interferes with mental health care. Am Psychol. 2004;59(7):614–625.
5. Leong FTL, Lau ASL. Barriers to providing effective mental health services to Asian Americans. Ment Health Serv Res. 2001;3(4):201–214.
6. Stiefel M, Nolan KA. IHI Innovation Series: Guide to Measuring the Triple Aim: Population Health, Experience of Care, and Per Capita Cost. Cambridge, MA: Institute for Healthcare Improvement; 2012.
8. Kessler RC, Berglund P, Demler O, et al. The epidemiology of major depressive disorder: results from the National Comorbidity Survey Replication (NCS-R). JAMA. 2003;289(23):3095–3105.
10. Tancredi DJ, Slee CK, Jerant A, et al. Targeted versus tailored multimedia patient engagement to enhance depression recognition and treatment in primary care: randomized controlled trial protocol for the AMEP2 study. BMC Health Ser Res. 2013;13(1):141.
11. Joffres M, Jaramillo A, Dickinson J, et al; Canadian Task Force on Preventive Health Care. Recommendations on screening for depression in adults. CMAJ. 2013;185(9):775–782.
16. Platt R, Carnahan RM, Brown JS, et al. The U.S. Food and Drug Administration's Mini-Sentinel program: status and direction. Pharmacoepidemiol Drug Saf. 2012;21(S1):1–8.
17. Klompas M, Eggleston E, McVetta J, Lazarus R, Li L, Platt R. Automated detection and classification of type 1 versus type 2 diabetes using electronic health record data. Diabetes Care. 2013;36(4):914–921.
18. Klompas M, Haney G, Church D, Lazarus R, Hou X, Platt R. Automated identification of acute hepatitis B using electronic medical record data to facilitate public health surveillance. PLoS One. 2008;3(7):e2626.
19. Klompas M, Murphy M, Lankiewicz J, et al. Harnessing electronic health records for public health surveillance. Online J Public Health Inform. 2011;3(3):1–7.
20. Dixon BE, Vreeman DJ, Grannis SJ. The long road to semantic interoperability in support of public health: experiences from two states. J Biomed Inform. 2014;49:3–8.
23. Kessler RC, Berglund P, Chiu WT, et al. The US National Comorbidity Survey Replication (NCS-R): design and field procedures. Int J Methods Psychiatr Res. 2004;13:69–91.
24. Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics and Quality. National survey on drug use and health, 2012 and 2013. https://nsduhweb.rti.org/respweb/homepage.cfm. Accessed February 10, 2016.
25. Kaiser Permanente Division of Research (Northern California). Mental health research network. http://hcsrn.org/mhrn/en/. Accessed February 10, 2016.
27. Ross T, Ng D, Brown J, et al. The HMO research network virtual data warehouse: a public data model to support collaboration. EGEMs (Wash DC). 2014;2(1):1049.
28. McGlynn EA, Lieu TA, Durham ML, et al. Developing a data infrastructure for a learning health system: the PORTAL network. J Am Med Inform Assoc. 2014;21(4):596–601.
29. Patient Centered Outcomes Research Institute. PCORnet, the national patient-centered clinical research network. http://www.pcornet.org/. Accessed October 30, 2015.
30. Vogel J, Brown JS, Land T, Platt R, Klompas M. MDPHnet: secure, distributed sharing of electronic health record data for public health surveillance, evaluation, and planning. Am J Public Health. 2014;104(12):2265–2270.
31. Beck A, Davidson AJ, Xu S, et al. A multilevel analysis of individual, health system, and neighborhood factors associated with depression within a large metropolitan area. J Urban
32. Burstein H. Finite population correction for binomial confidence limits. J Am Statist Assoc. 1975;70(349):67–69.
35. Fortin MJ, Dale M. Spatial autocorrelation. In: Fotheringham AS, Rogerson PA, eds. The SAGE Handbook of Spatial Analysis. Thousand Oaks, CA: SAGE Publishing; 2009.
37. Vu KO, Shupe A, Jones AM, Hamilton J, Colorado Department of Public Health and Environment. The burden of depression and anxiety in Colorado: findings from the Colorado behavioral risk factor surveillance system, 2008. Health Watch. 2010;76:1–6.
38. McVeigh KH, Newton-Dame R, Chan PY, et al. Can electronic health records be used for population health surveillance? Validating population health metrics against established survey data. EGEMs (Wash DC). 2016;4(1, article 27):1267.
41. Thorpe LE, McVeigh KH, Perlman S, et al. Monitoring prevalence, treatment, and control of metabolic conditions in New York City adults using 2013 primary care electronic health records: a surveillance validation study. EGEMs (Wash DC). 2016;4(1, article 28):1266.
42. Gardner EM, McLees MP, Steiner JF, del Rio C, Burman WJ. The spectrum of engagement in HIV care and its relevance to test-and-treat strategies for prevention of HIV infection. Clin Infect Dis. 2011;52(6):793–800.
43. Institute of Medicine. Capturing Social and Behavioral Domains and Measures in Electronic Health Records: Phase 2. Washington, DC: The National Academies Press; 2014.
Keywords: depression; electronic health records; monitoring; urban
Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved.