Bayley, K. Bruce PhD*; Belnap, Tom MS†; Savitz, Lucy PhD, MBA†; Masica, Andrew L. MD, MSCI‡; Shah, Nilay PhD§; Fleming, Neil S. PhD, CQE∥
Leading health care systems have implemented electronic health record (EHR) systems and other health information technology in support of health care delivery, health services research (HSR), and professional education.1 Compared to traditional administrative data or insurance claims, EHR source data provide a more detailed picture of patient-level encounters, with a wealth of granular clinical detail.2,3 There is a growing body of published research using EHR as the core data source, and the infrastructure supporting the use of EHRs in research endeavors is being expanded.4–7 Use of EHR data for comparative effectiveness research (CER) provides a unique opportunity to address important questions that have value in practice for patients, providers, policymakers, and payers.8,9 In a recent commentary, Brook notes that HSR has contributed tools and techniques “…that have profoundly affected the way medicine is practiced.” He further notes what HSR “…has not done is revolutionize the way medicine is practiced to increase its value and to moderate costs”.10 Brook suggests that the way ahead is to encourage collaboration between practitioners and researchers to address 5 issues for transforming medical care delivery: reliability, appropriateness, frequency, labor, and transparency. EHRs currently present a valuable mechanism for enabling a partnership between health services researchers and clinicians to improve health care delivery.
Secondary data are also desirable from an HSR perspective because they are relatively inexpensive and efficient to use (the data have already been collected). As we begin to rely on this new source of secondary data, recognition of its limitations is an important consideration for study design and methods.
This paper describes the “real-world” experiences of 4 leading health systems in using the EHR for research, based on a practical case study involving hypertension (HTN) treatment (Box 1). These 4 systems had previously made a collective presentation of their independent work using EHR data. In this paper, we present the origins and nature of the challenges encountered in conducting a comparative effectiveness study and suggest both specific and general strategies to deal with these issues.
Although the study was exploratory rather than theory based, we add to the emerging literature on frameworks for data quality assessment by referencing our parallels to 2 proposed schemas along the way.11,12 We find the pragmatic approach taken by Kahn and colleagues based on an adaptation of Wang and Strong to fit most closely with our observations.
We illustrate the challenges involved in using EHR data for CER using a replicated case study approach.13 The specific research question seeks to evaluate the comparative effectiveness of antihypertensive medications on longitudinal blood pressure control in a population of patients with documented HTN diagnoses followed in community-based primary care practices with EHRs. On the basis of the nature of HTN (and its associated sequelae) as a chronic condition, blood pressure control outcomes are assessed over a 3-year time period. Such a study highlights issues that are commonly found when using ambulatory data from multiple clinics. Importantly, the case study design requires identifying patients with new diagnoses of HTN (rather than preexisting) that were traceable within the EHR.
Researchers from 4 health systems attempted to extract data from their EHR systems for this CER example, noting which electronic system needed to be accessed, the data elements that were straightforward to capture, and those that posed problems. Strengths and limitations of the EHR were noted, as were suggested strategies for overcoming specific issues. To guide the work, each delivery system followed a detailed set of data specifications (Table 1), in order to test whether a cross-health system study could be performed with comparable data.
Each institution was able to easily identify an overall cohort of patients with HTN, but important differences emerged when refining the patient population to meet study criteria using information from the EHR. In 1 institution (Baylor), most ambulatory sites transitioned to the EHR during the desired study period of 2007–2011. This limited the number of patients within the Baylor records with adequate 3-year follow-up after their HTN diagnosis. Other institutions with a longer history of EHR use were able to capture the entire patient cohort. It should be noted that “disease onset” is rarely captured within an EHR. By defining the population as patients with at least 12 months of previous care by their provider, we limited the number of patients who may have had undetected disease before the date of their “hypertension visit” (the visit at which high blood pressure was noted and placed in the problem list).
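The cohort criteria described above can be sketched as a simple eligibility check. This is a minimal illustration, not the extraction logic used at any of the 4 systems: the function name `is_eligible`, its parameters, and the approximation of "12 months" and "3 years" as 365 and 1095 days are all assumptions made for the example.

```python
from datetime import date

def is_eligible(first_visit, htn_visit, last_visit,
                min_prior_days=365, min_followup_days=3 * 365):
    """Check the two cohort criteria for one patient.

    first_visit: earliest documented visit with the provider (hypothetical field)
    htn_visit:   the "hypertension visit" at which HTN entered the problem list
    last_visit:  latest documented visit in the EHR (hypothetical field)
    The day thresholds are illustrative approximations of 12 months and 3 years.
    """
    prior_ok = (htn_visit - first_visit).days >= min_prior_days
    followup_ok = (last_visit - htn_visit).days >= min_followup_days
    return prior_ok and followup_ok

# A patient at a site with a long EHR history
long_history = is_eligible(date(2006, 1, 1), date(2007, 6, 1), date(2011, 1, 1))

# A patient at a site that adopted the EHR late in the study period
late_adopter = is_eligible(date(2009, 1, 1), date(2010, 3, 1), date(2011, 12, 31))
```

Under these assumptions the second patient is excluded for insufficient follow-up, which mirrors how a late EHR transition shrinks the usable cohort.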
Providers may modify diagnoses during follow-up visits so that the dates of problem entries in the medical record may not reflect the true onset date. The fields for the status of specific diagnoses are intentionally modifiable at each encounter to account for changes in patient condition. This is generally less applicable for a chronic diagnosis like HTN compared to acute (eg, pneumonia) or symptom-based (eg, abdominal pain) problems that either resolve or are turned into more specific diagnoses. However, it is possible that a patient originally diagnosed with HTN who modifies salt intake, loses weight, and begins to exercise would subsequently improve their blood pressure and appropriately have that condition removed from their EHR diagnosis list. Likewise, the original diagnosis might also need to be revised (eg, from “essential hypertension” to “hypertensive kidney disease”) based on the patient’s course.
Medication prescription information was available at all institutions (Baylor, Intermountain, Mayo Clinic, Providence). However, EHR data at these sites generally capture only drug orders; they do not contain information on whether patients filled the initial prescription and were subsequently adherent to their prescribed regimen. Some institutions (Intermountain, Mayo Clinic) had partial prescription filling information through their health plan, but only for a subset of patients. Longitudinal medication information may also be inaccurate because EHR medication lists are not always updated when treatment changes occur, particularly if the patient saw another physician using paper records or an EHR platform differing from the one used by the primary care network (Intermountain, Mayo Clinic, Providence).
The cross-case comparison revealed several areas that require special attention when using EHR data for CER. Some of these were general issues that extend beyond our approach, but need to be considered. Others were apparent in the variation and limitations in the data that were compiled in our comparative analyses.
In order to generalize these issues into more actionable themes, the findings have been categorized into 5 general areas: missing data, erroneous data, uninterpretable data, inconsistencies among providers and over time, and data stored in noncoded text notes. These categories allow for an analysis of the lessons learned based on case study findings. Table 2 provides a detailed summary of challenges encountered.
In a “Fit for Use” data quality model the degree and distribution of “missingness” in the EHR is an example of a conceptual data quality feature that can be extremely important in the context of a CER study.11 The absence of certain data fields can limit the outcomes to be studied, the number of explanatory factors considered, and even the size of the population included.
As EHR data are captured at the point of care by health care providers, patients who do not regularly interact with the health system (or who interact with multiple nonintegrated health care providers) may have incomplete data. In the case of HTN, patients may have encounters at irregular intervals, in contrast to the consistent data collection protocols used in HTN clinical trials. Reduced data frequency does not necessarily mean that patients are healthy between visits. They may be getting care elsewhere, or be restricted by insurance or financial constraints. Even if patients are being seen, there may be gaps in what information is collected during these encounters. Some providers may not collect the same details as others or collect data at the same intervals.
Suggested strategies to overcome barriers caused by missing data depended on the type and proportion of data that is missing and how it relates to the outcome of interest. If data elements were simply unavailable, some researchers suggested exploring alternative datasets such as payer data. Others suggested that their electronic systems might contain surrogate data elements that have similar information to the elements that are unavailable (eg, an ICD-9 code for a complication of HTN could identify patients with poorly controlled blood pressures if the actual blood pressure measurements were sparse). A limitation of this approach is that complication codes may not be used frequently or consistently across systems such that using them may introduce further biases. Finally, statistical techniques were suggested as a way to account for some effects of missing data, particularly if data were available for some patients or for all patients at different time intervals.14 Where incomplete data are a result of patients leaving a system such that their status can no longer be observed, statistical approaches to dealing with censored data can be used.
EHR data within these health systems span a variety of patients, conditions, service areas, and geographic locations. Most data are entered by busy practitioners in the course of a visit, supplemented by additional notes or entries made at the end of the day (based on recall or jotted notes). It is, therefore, common to find errors in EHR data, or in other words, a lack of intrinsic accuracy.11
In the HTN case study, we encountered variation in how blood pressure data are collected and who collected the data. Fluctuations in the observed values were attributed in part to the varied experience of the caregiver, the equipment used, and the measurement technique. We also found blood pressure values that would be classified as recording errors in the medical record system.
Suggestions for identifying and correcting erroneous data emphasized a validation process for the EHR data before it is used for research. Both internal and external validation techniques were proposed. Several of these correspond to suggested methods for addressing the intrinsic data quality feature of believability.11 Internal validation consists of validating the data using the data itself. This can be done by looking for unrealistic values (a blood pressure that is too high or low). Dates can be used to make sure results are not documented before tests were carried out and that procedures are performed in a logical order. For example, dates can be used to verify that knee replacement revisions are not documented as being done before a primary total knee replacement. This is an example of a state-dependent rule.11 External validation involved looking into the EHR or other patient information to ensure that the values recorded match what was observed. For example, coded values can be compared to information in text notes and data elements can be compared to historical values.
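The internal validation checks described above can be sketched in a few lines. This is an illustrative example under stated assumptions: the plausibility limits, the flag names, and both function names are hypothetical and would need to be set by each study team; they are not values used in the case study.

```python
from datetime import date

# Illustrative plausibility limits in mm Hg (assumptions, not study values)
SBP_RANGE = (50, 300)
DBP_RANGE = (20, 200)

def bp_flags(systolic, diastolic):
    """Return a list of reasons a blood pressure reading looks unrealistic."""
    flags = []
    if not SBP_RANGE[0] <= systolic <= SBP_RANGE[1]:
        flags.append("systolic_out_of_range")
    if not DBP_RANGE[0] <= diastolic <= DBP_RANGE[1]:
        flags.append("diastolic_out_of_range")
    if diastolic >= systolic:
        flags.append("diastolic_not_below_systolic")
    return flags

def revision_precedes_primary(primary_date, revision_date):
    """State-dependent rule: a revision dated before the primary procedure is suspect."""
    return revision_date < primary_date

plausible = bp_flags(120, 80)
keying_error = bp_flags(1200, 80)   # e.g., an extra digit typed at entry
swapped = bp_flags(80, 120)         # systolic and diastolic likely transposed
bad_order = revision_precedes_primary(date(2010, 5, 1), date(2009, 1, 1))
```

Flagged records would then be corrected, looked up, or accounted for in the analysis, depending on how many there are and whether they follow a pattern.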
Researchers from the 4 systems suggested that validation would help to understand the quality of the data and the types of errors that are present in the data. In some instances, errors can be corrected electronically if they follow a specific pattern. In other instances small numbers of outliers can be looked up and corrected. Also, outliers can be identified and then accounted for in the analysis.
As EHR data are entered by a large number of individuals at multiple locations, data may be entered based on different definitions. Data may be recorded without specifying units of measurement. Qualitative assessments (eg, in or out of control; mild, moderate, or severe) may be difficult to interpret. As mentioned above, medication adherence data are often missing. If validated psychometric scales to assess patient status are not used, one clinician’s assessment may or may not be comparable to another’s, making it impossible to interpret what is being reported. These problems fit Kahn et al’s11 category of intrinsic data quality features that lack objectivity.
Systems with more robust EHRs had fewer of these issues. Some EHRs only allow numeric data or valid dates in certain fields or make certain data elements required in order to ensure important interpretability information is collected. In all cases, however, an understanding of the data collection process can be helpful to identify rules that will allow the data elements to be interpreted.
Data collection tools and techniques may vary over time and across institutions. Data coding rules or data system capabilities change over time. Providers at 1 location may collect data differently than providers at other locations. These types of problems are sometimes termed “representational inconsistency.”12 In the HTN case study, there were several problem list options corresponding to HTN. These options were used with varying frequency at different locations. Data on some patients were not available for the entire study period due to variability in when clinics transitioned to EHR documentation. Also, blood pressure information was not collected at consistent intervals for patients.
Inconsistent data can lead to inaccuracies or study bias if they are not taken into account. Some inconsistencies can be identified directly from the data. Others require an understanding of how data are collected geographically and over time. If data are collected consistently at specific facilities or for specific time periods, they may be adjusted to account for differences over the study period. Finally, statistical methods can be used to account for unequal measures over time.10
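One simple way to put irregularly timed blood pressure readings on a common footing is to summarize them in fixed windows after the hypertension visit. The sketch below is an assumption for illustration only: it is not one of the statistical methods referenced above, and the function name `yearly_mean_systolic`, the 365-day window, and the use of a mean are all hypothetical choices.

```python
from collections import defaultdict
from datetime import date

def yearly_mean_systolic(index_date, readings, n_years=3):
    """Average systolic readings within each 365-day window after index_date.

    readings: list of (date, systolic) tuples in any order.
    Returns one value per window; None marks a window with no readings,
    so gaps stay visible instead of being silently filled.
    """
    buckets = defaultdict(list)
    for when, systolic in readings:
        days = (when - index_date).days
        if days < 0:
            continue  # reading taken before the hypertension visit
        window = days // 365
        if window < n_years:
            buckets[window].append(systolic)
    return [sum(buckets[w]) / len(buckets[w]) if buckets[w] else None
            for w in range(n_years)]

summary = yearly_mean_systolic(
    date(2008, 1, 1),
    [(date(2008, 3, 1), 150), (date(2008, 9, 1), 140), (date(2010, 6, 1), 130)],
)
```

Keeping empty windows as explicit gaps lets the analyst decide later whether to apply a missing-data or censoring method rather than hiding the problem at extraction time.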
Unstructured Text Data
Despite the increasing sophistication of EHRs, a large portion of treatment data is still collected as unstructured text. Although this information is easy to understand when treating an individual patient, it is problematic for research. Some data elements were only captured as text (eg, date of onset and notes about medication use) and therefore had limited utility for CER. Other data elements had dedicated coded fields, but nevertheless the data were written into a note rather than entered into the coded field. We suspect this occurs when coded fields are located away from other similar data elements or when their use is cumbersome or inconvenient. In any case, the unstructured text residing in the EHR not only represents data with poor accessibility but may also suffer from other data quality issues such as a lack of objectivity, consistency, or completeness.12
In some health systems, data extraction techniques such as natural language processing (NLP) are being used with increasing frequency to identify information directly from text notes.15 Depending on the type of information and the level of text note structure, it was sometimes possible to extract useful data from unstructured text. However, depending on the sample size, type, and prevalence of the data elements, it was not always worth the time and resources necessary to mine the text for useful data.
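As a rough illustration of pulling a structured value out of a text note, the sketch below uses a regular expression to find blood pressure readings. This is far simpler than the NLP systems referenced above and is offered only as an assumption-laden example: the pattern, the function name `extract_bp`, and the sample note are hypothetical, and real notes would need handling for negation, history, and many more phrasings.

```python
import re

# Matches "BP" or "blood pressure", up to 15 non-digit characters,
# then a systolic/diastolic pair such as 142/88 (illustrative pattern only).
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\D{0,15}?(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)

def extract_bp(note):
    """Return all (systolic, diastolic) pairs mentioned in a free-text note."""
    return [(int(s), int(d)) for s, d in BP_PATTERN.findall(note)]

note = "Pt seen today. BP 142/88, repeat BP: 138/84. Continue lisinopril."
pairs = extract_bp(note)
```

Even a simple pattern like this has to be validated against manual review before use, which is part of why text mining was not always judged worth the time and resources.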
Table 3 provides a summary of the strengths and limitations of EHR data with regard to each issue, along with lessons learned during the study.
Despite the sophistication of the data systems at these leading health care systems, a number of EHR data use challenges are likely to hamper the extraction of valid data for comparative effectiveness studies. Our replicated case example illustrates these challenges and suggests strategies for overcoming them. Researchers are cautioned against a blind reliance on EHR data as a singular, definitive data source.
At the same time, there is no question that EHR data are superior to administrative or claims data alone, and are cheaper and timelier than clinical trials or manual chart reviews. With some exceptions (eg, information on prescription medication filling) EHR data can yield a more detailed picture of a patient’s health and health care. Researchers at all 4 health systems believe that the effort to use and improve EHR data for CER is definitely worthwhile and all are actively pursuing this path.
We see 7 longer-term developments that are likely to improve the utility and quality of EHR data for CER. These include: interaction with clinicians through feedback reporting, use of EHR-based registries, use of data warehouses that merge EHR and non-EHR data, use of NLP to “unlock” useful information trapped in free-text fields, team-based care/communications, exploring episodes of care across the continuum, and reimbursement reforms that require documentation in coded fields.
First, the continued production of feedback reports to clinicians using EHR data can serve to demonstrate value of population data in practice, highlight documentation issues, and encourage better documentation. EHR data can be a valuable resource for clinicians who want to manage their patient populations effectively, compare favorably to their peers or to external reviewers on quality measures, and deal with performance auditors whose attitude is “not documented, not done.”
Second, management of selected subpopulations and clinical epidemiology relies on specialized registries built off the EHR (eg, clinic/medical home, chronic disease, immunization, children with special health care needs). These registries themselves are a boon to data quality because they force definition of clinical parameters that are going to be stored and displayed within the registry. EHR-derived registries are a great advance over stand-alone databases that require manual chart review.
Third, the development of data warehouses to store EHR and other data serves a similar function as the EHR registry, with additional advantages. Beyond the clarification of data definitions, the warehouse can store meta-data that explain data context and meaning. Moreover, the data warehouse permits the linking of EHR data to other sources such as financial systems using an enterprise-wide patient identifier. These multifaceted datasets are among the most interesting for research, linking utilization, quality, cost, and measures of patient experience.16
Fourth, NLP appears to be an extremely powerful mechanism to access data otherwise trapped in EHR free-text fields, with a wide array of potential applications.17,18 The Agency for Healthcare Research and Quality is currently funding a grant aiming to create a common, accessible NLP platform that can interface with EHR data to conduct CER.19 It is anticipated that this platform will ultimately be used by a broad community of CER researchers.
Fifth, the movement of primary care practices to a patient-centered medical home model may aid in accurate and standardized documentation. The patient-centered medical home emphasizes team-based care, making it imperative for each team member to be able to find key clinical data and have a shared understanding of its meaning.
Sixth, the need to follow patients across settings and institutions is creating a demand for clinical applications that can be linked together to provide a complete patient picture for episodes of care. Health systems that were content to pick the “best of breed” applications for each clinical area are now demanding more connected and complete EHR systems.
Finally, reimbursement reforms related to measuring components of service and quality may require documentation in coded fields, facilitating data capture in some instances. For example, some global payment mechanisms require quality metrics based on clinical documentation rather than billing codes. However, payment reform may not incentivize repeated collection of patient-reported outcomes, which are of emerging importance to CER studies. Issues of completeness, consistency, and interpretability may dog these new measures unless an effort at standardization is made.
Improving data quality within the EHR in order to facilitate research will remain a challenge as long as research is seen as a separate activity from clinical care. A more useful approach is a partnership between researchers, clinicians, and IT professionals that focuses on research of value to the care team. In turn, we believe that working with researchers helps clinicians understand the importance of data quality for improvement, innovation, and discovery. Discussions of documentation and debates about data definitions can be a productive part of clinical care conversations—what patient data are most important, what should be tracked and trended, and how the EHR can be effectively shared among the team.
Improving EHR data quality is also an important area for further research. Are certain data fields more valid because they are clinically more important to providers? Are there aspects of the user interface that can improve data accuracy and completeness? Does clinician feedback improve quality? Are some disciplines better than others in terms of data quality? Does automated data entry (eg, for vital signs) hold promise?
The pursuit of high-quality EHR data for CER presents a major advancement in data availability to address important questions, drive improvement and efficiency, and support care delivery transformation. In the long run, clinicians and researchers can be productive partners in mining the valuable resource represented by the EHR.
1. Caruso D, Kerrigan C, Mastanudo M. Improving the value-based care and outcomes of clinical populations in an electronic health record system environment. The Dartmouth Institute for Health Policy & Clinical Practice [serial online] 2011. Available at: http://tdi.dartmouth.edu. Accessed June 2, 2012.
2. D’Avolio LW, Farwell WR, Fiore LD. Comparative effectiveness research and medical informatics. Am J Med. 2010; 123(12 suppl 1):e32–e37.
3. Weiner MG, Embi PJ. Toward reuse of clinical data for research and quality improvement: the end of the beginning? Ann Intern Med. 2009; 151:359–360.
4. Safran C, Bloomrosen M, Hammond WE, et al. Toward a national framework for the secondary use of health data: an American Medical Informatics Association White Paper. J Am Med Inform Assoc. 2007; 14:1–9.
5. Maro JC, Platt R, Holmes JH, et al. Design of a national distributed health data network. Ann Intern Med. 2009; 151:341–344.
6. Pace WD, Cifuentes M, Valuck RJ, et al. An electronic practice-based network for observational comparative effectiveness research. Ann Intern Med. 2009; 151:338–340.
7. Brown JS, Holmes JH, Shah K, et al. Distributed health data networks: a practical and preferred approach to multi-institutional evaluations of comparative effectiveness, safety, and quality of care. Med Care. 2010; 48(suppl 6):S45–S51.
8. Masica MD, Collinsworth MPH. Leveraging Electronic Health Records in Comparative Effectiveness Research. Prescriptions for Excellence in Health Care Newsletter Supplement. 2012; 1:6.
9. Slutsky JR, Clancy CM. Patient-centered comparative effectiveness research: essential for high-quality care. Arch Intern Med. 2010; 170:403–404.
10. Brook RH. Health services research and clinical practice. JAMA. 2011; 305:1589–1590.
11. Kahn MG, Raebel MA, Glanz JM, et al. A pragmatic framework for single-site and multisite data quality assessment in electronic health record-based clinical research. Med Care. 2012; 50:S21–S29.
12. Wang RY, Strong DM. Beyond accuracy: what data quality means to data consumers. J Manage Inf Syst. 1996; 12:5–33.
13. Yin RK. Applications of Case Study Research. 3rd ed. Thousand Oaks, CA: SAGE Publications, Incorporated; 2011.
14. Little RJ, Rubin DB. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annu Rev Public Health. 2000; 21:121–145.
15. Murff HJ, FitzHenry F, Matheny ME, et al. Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA. 2011; 306:848–855.
16. Duvall SL, Fraser AM, Rowe K, et al. Evaluation of record linkage between a large healthcare provider and the Utah Population Database. J Am Med Inform Assoc. 2012; 19:e54–e59.
17. Hazlehurst B, Mullooly J, Naleway A, et al. Detecting possible vaccination reactions in clinical notes. AMIA Annu Symp Proc. 2005:306–310.
18. Hazlehurst B, Sittig DF, Stevens VJ, et al. Natural language processing in the electronic medical record: assessing clinician adherence to tobacco treatment guidelines. Am J Prev Med. 2005; 29:434–439.
19. Enhancing Clinical Effectiveness Research with Natural Language Processing of EMR. Brian L. Hazlehurst, Kaiser Foundation Research Institute (R01 HS19828-01).