While electronic medical records (EMRs) have held great promise for the abundance and efficiency of research as well as for performance improvement opportunities,1–7 without the implementation of comprehensive and diligent health information management (HIM) teams,8–10 EMRs may pose a greater liability for hospitals and researchers by producing an abundance of errors and incomplete records.11–15 Further, inaccurate or incomplete data, when extracted for any type of research, threaten to increase type I and type II errors,16–21 raising grave concern about how these data, once published, might mislead the future of medicine. This report details an example of how a clinically oriented EMR can produce erroneous data output; it identifies the pitfalls in EMR implementation that caused erroneous data entry and provides concise recommendations for implementing an EMR and collecting EMR data that are more likely to be accurate than erroneous. The case study describes a simple extraction of 2 basic airway management variables that quickly escalated into a laborious manual review, ultimately revealing a lack of systematic documentation processes and over 200 erroneous anesthesia records. Professionals with and without expertise in HIM will benefit from the fundamental principles provided for managing EMRs as well as the accompanying recommendations for facilitating their successful use of electronic health data, whether to inform clinical decisions or for the purposes of clinical research.
The purpose of this study was to describe airway management, namely the use of a laryngeal mask airway (LMA®) versus an endotracheal tube (ETT), and airway removal timing (deep versus awake), for 2 common pediatric surgical procedures, circumcision and tonsillectomy with or without adenoidectomy. After institutional review board approval, study data for American Society of Anesthesiologists (ASA) classification I or II patients <21 years of age were extracted from anesthesia EMRs from June 2012 to 2014 (n = 1654).
A simple query of our anesthesia EMR asked 2 basic questions: “What airway device was used?” and “When and how was this airway device removed?” The first dataset indicated that 115 (7%) of the 1654 study cases had no airway device documented and 246 cases (15%) had no airway removal information. The abundance of missing information prompted a manual audit of these records, which exposed numerous inaccuracies within the original data. Using a combination of free-text fields and alternate data sources, an airway device was identified in 112 (97%) of the 115 cases that originally indicated no airway device was used (Table). The remaining 3 cases were relabeled as “unknown.” Further, the timing of airway removal was clarified for 156 (63%) of the 246 cases initially marked as “unknown” (Table).
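The proportions reported above follow directly from the stated counts. The short Python check below reproduces them using only the figures given in this section; the helper name `pct` and the constant names are ours and purely illustrative.

```python
def pct(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# Counts taken from the text of this section.
TOTAL_CASES = 1654          # study cases in the initial extraction
NO_DEVICE_DOCUMENTED = 115  # cases with no airway device in the first dataset
NO_REMOVAL_INFO = 246       # cases with no airway removal information
DEVICE_RECOVERED = 112      # airway devices identified on manual audit
REMOVAL_RECOVERED = 156     # removal timing clarified on manual audit

print(pct(NO_DEVICE_DOCUMENTED, TOTAL_CASES))       # 7
print(pct(NO_REMOVAL_INFO, TOTAL_CASES))            # 15
print(pct(DEVICE_RECOVERED, NO_DEVICE_DOCUMENTED))  # 97
print(pct(REMOVAL_RECOVERED, NO_REMOVAL_INFO))      # 63

# Cases that stayed unresolved after the manual audit.
print(NO_DEVICE_DOCUMENTED - DEVICE_RECOVERED)      # 3
print(NO_REMOVAL_INFO - REMOVAL_RECOVERED)          # 90
```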
The EMR at our facility is known as the Anesthesia Information Management System. The program MetaVision Suite is the product of iMDsoft (Needham, MA). When documenting airway management, there are 9 initial options: direct laryngoscopy, LMA, intubated via LMA, flexible fiberoptic, video laryngoscope, in situ, natural airway, failed natural airway, and suspension laryngoscopy. The LMA button expands to allow for more specific documentation about the LMA. The other 5 buttons on the top row (see Figure 1) expand to allow for more specific documentation about the ETT.
When documenting how the airway was removed, there is a section for “LMA Removal,” where providers can select a checkbox for either awake or deep. “Extubation ETT” is not as straightforward. Here, there are multiple clinical options available, with “extubated deep” being one of them. While there is no option for “extubated awake,” this condition can be assumed if the “following commands” box is checked (Figure 2).
This particular EMR’s configuration does not allow for autopopulated, paired airway insertion/removal technique scripts based on device selection, as may be the case with other programs. Therefore, both “extubation” fields are provided regardless of the technique used to manage the airway. A review of the Anesthesia Information Management System also revealed that there are several other, less obvious locations to document these variables within the record, all of which lack a cross-population function to ensure the information is transferred to the appropriate related fields. Further, redundancy in data entry and documentation location was enabled by a lack of hard stops in the system. Hard stops were deliberately omitted, however, to encourage providers to document airway selection and removal information as part of the natural flow of clinical care. Although electronic data entry was designed to ease documentation for clinicians, it has created significant complexities and complicated the data analysis process, making extraction for research purposes both onerous and highly susceptible to error. Thus, given the redundancies and omissions in data entry points and the lack of hard stops to encourage provider documentation in one location, autoextracted data must be regarded with skepticism about their accuracy.
RISKS OF USING EMR DATA: WHAT EVERY PROVIDER NEEDS TO KNOW
Ultimately, the root cause of both the aforementioned documentation errors and the inaccuracies that plagued the EMR-extracted data set stemmed from (1) inadequate knowledge on the part of our clinical research team regarding the configuration of the anesthesia EMR and how anesthesia providers document airways and airway management; (2) insufficient communication between our clinical research team, the department’s HIM team, and anesthesia providers; (3) our failure to recognize how time constraints during certain procedures impact clinician documentation; and (4) a lack of understanding regarding how EMR configuration impacts medical record accuracy, completion, and overall data quality. The following details lessons learned.
Free-Text Fields, Discrete Fields, and Forced Fields: What Are They and How Do They Affect Data Quality?
Understanding the format of medical record fields and how this impacts the front-end entry of health data is critical in understanding issues with data quality.22 Fields are often formatted in 1 of 2 ways: free-text or discrete. Free-text fields allow for the entry of unstructured text within the record. In doing so, they promote the inclusion of information that may not be collected elsewhere in the medical record; however, they also allow information to be documented inconsistently across patients,23 potentially compromising the accuracy and completeness of the record. Discrete fields address issues of consistency by allowing users to select from a finite list of relevant options, often provided in a drop-down menu or checkbox format. This allows for rapid data recording, producing a document that is more easily completed while simultaneously delivering care. Unfortunately, this format is prone to user errors such as the “adjacency error,” where providers accidentally select an option next to the intended option.24 Creating the optimal options list is also difficult; long lists of detailed options may increase accuracy, but time constraints may lead busy clinicians to select the first option that appears adequate. Conversely, general categories and options such as “other” or “not otherwise specified,” while easy to select, provide less meaningful information for both primary and secondary users of the data. If the options provided are deemed inadequate by users, they may also opt to enter the relevant information in free-text locations instead. One must be cognizant of these trade-offs when attempting to draw conclusions from EMR data.
A basic understanding of forced fields and how they impact the entry of health data is also critical in understanding issues with data quality. Forced fields, also referred to as “required items” or “hard stops,” demand that certain data elements be entered before the record is closed. Intuitively, requiring specific information to be input into corresponding data fields leads to a more complete and comprehensive record. Unfortunately, increasing the number of forced fields makes completing a medical record a more cumbersome task. If used excessively, forced fields may end up being completed with invalid information, resulting in a more “complete” but less accurate record. Clinical productivity may also be negatively impacted; providers who are unsure of the answer to a forced field may be unable to advance to further documentation until the field is completed.24
Incorporating knowledge of the formatting and nature of medical record fields into the context of this case study helps to illustrate how EMR system configurations can impact data collection and capture. The airway removal fields utilized to obtain the initial study data were configured as discrete fields to promote consistency but were not forced fields within the anesthesia EMR. Thus, users often passed over these fields, neglecting to select one of the airway removal options. This led to substantial data omission both in the medical record and within the initial data extraction; airway removal timing was unknown for almost 15% of the study cases. Much of this information was obtainable via manual chart review, however. Users passing over the discrete fields often input the airway removal information into the free-text “case comments” section, a potential indication that some providers either felt the discrete options inadequately represented the airway removal technique used for the case, or that the field used to document extubation was too cumbersome to locate.
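The pattern described above, a blank discrete field accompanied by relevant free-text comments, can be used to prioritize records for manual chart review. The following Python sketch is a minimal illustration under stated assumptions: the field names, the keyword pattern, and the sample records are hypothetical and do not reflect the actual configuration or contents of our EMR.

```python
import re
from typing import Optional

# Hypothetical keyword pattern; a real project would agree on terms with clinicians.
REMOVAL_HINT = re.compile(r"\b(deep|awake)\b", re.IGNORECASE)


def needs_manual_review(discrete_removal: Optional[str], case_comments: Optional[str]) -> bool:
    """Flag a record whose discrete removal field is blank but whose
    free-text comments mention a removal-timing keyword."""
    field_blank = discrete_removal is None or discrete_removal.strip() == ""
    return field_blank and REMOVAL_HINT.search(case_comments or "") is not None


# Made-up records for illustration only.
records = [
    {"id": 1, "removal": "deep", "comments": ""},
    {"id": 2, "removal": None, "comments": "LMA removed deep"},
    {"id": 3, "removal": None, "comments": ""},
]

flagged = [r["id"] for r in records if needs_manual_review(r["removal"], r["comments"])]
print(flagged)  # [2]
```

A keyword match only identifies candidates for review; a comment such as “not deep” would also match, so the flagged charts still require interpretation by someone with clinical knowledge.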
Understanding how forced, free-text, and discrete fields impact data quality can enable the successful utilization of EMR data, regardless of whether it is being used for clinical decision making or for research purposes. The following is a list of recommendations to facilitate successful data extraction:
- Researchers without clinical experience should contact the system’s administrator to obtain access to a test record; this should be created on the current level of the software being used and will allow for familiarization with the configuration of the EMR system.
- After determining the configuration of the data elements being investigated, researchers should inquire about the configuration changes since EMR implementation, especially with reference to the study period.
- Understand that if the desired information is contained within free-text fields, collecting these data will often require manual chart review and the interpretation of nuanced information.
- If data are retrieved from a field that is not forced, it is important to understand that there may be a substantial amount of missing data. This may be remediable through manual chart review or the use of certain statistical methods (imputation, sensitivity analyses, etc).25
- It is important to remember that a physician’s main focus is to provide exceptional patient care; busy clinicians are more likely to commit documentation errors when entering items while simultaneously delivering clinical care.26
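One simple form of the sensitivity analysis mentioned in the recommendations above is to bound a proportion by assuming that every missing value falls either inside or outside the category of interest. The Python sketch below is illustrative only; the function name and the counts in the usage lines are hypothetical and are not study results.

```python
from typing import Tuple


def proportion_bounds(n_positive: int, n_negative: int, n_missing: int) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the proportion of positive records.

    lower: assumes every missing record is negative.
    upper: assumes every missing record is positive.
    """
    total = n_positive + n_negative + n_missing
    if total == 0:
        raise ValueError("no records supplied")
    return n_positive / total, (n_positive + n_missing) / total


# Hypothetical counts: 60 documented "deep", 30 documented "awake", 10 undocumented.
low, high = proportion_bounds(60, 30, 10)
print(f"deep removal: between {low:.0%} and {high:.0%}")  # deep removal: between 60% and 70%
```

If a study conclusion holds at both ends of the interval, the missing values are unlikely to change it; if it does not, manual chart review or a formal imputation method becomes more important.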
Standardized Variable Definitions: Why Are They Valuable and How Do They Impact Data Quality?
Establishing standardized variable definitions is important because the value of the data entered into EMRs is heavily dependent on clinicians’ agreement about the meaning of each data element. Unfortunately, definitions and semantics of patient and health care information can vary between physicians and organizations. Without standardized definitions, the meaning of the information documented is often ambiguous, and the validity of the health information entered becomes questionable.27 Intuitively, having standardized definitions produces a more accurate health record, facilitating the use of the health data for clinical decision making or for research purposes. In addition, it augments the value of many configurational changes; the addition of a forced field, for example, is only useful if there is a consensus among clinicians about what the information entered into the field actually means.
To fully understand how standardized variable definitions can impact EMR data, it is helpful to again refer to the extubation status data collected in this case study. If the case was managed with an LMA and the LMA was removed before the patient left the operating room, 1 of the 2 checkbox options for LMA removal (awake or deep) should be selected by the provider, as part of the “End of Case Documentation.” When the patient is transported for recovery, the “Transport Airway Documentation” field provides a checkbox to note that the patient was transported “Extubated.” A disagreement has ensued about whether LMAs are simply “removed” or formally “extubated” such that providers managing a case with an LMA document this information inconsistently; some, as was discovered by the research team, used the “Extubation” field, while others documented using a free-text field elsewhere in the record.
Logic dictates that if standardized variable definitions improve the consistency, accuracy, and completeness of documentation, then standardized definitions should exist for every variable within EMRs. However, it is important to recognize that the complexity of EMR components varies greatly. While some variables are simpler and less subjective by nature (eg, the number of intubation attempts), other variables are considerably more complex and not easily definable (eg, mental status). Furthermore, because of the individual nature of each patient, standardized definitions do not completely eliminate gray areas. ASA classification, for example, while having a definition for each level, remains a frequent topic of debate between clinicians. Additionally, those responsible for configuring the EMR may not realize that certain fields are subject to different interpretations until after the product has gone online.
Approaching the use of EMR data with an understanding of how standardized variable definitions impact data quality is critical for both clinicians and researchers. The recommendations below demonstrate how to use an understanding of standardized definitions to make meaningful use of EMR data:
- Determine whether there are standardized variable definitions for the data points being collected. This provides a better understanding of what the data being examined actually mean (ie, what the provider meant when he or she entered the data element).
- Determine whether education has been provided regarding the standardized variable definitions; although a data dictionary may exist for your EMR system, this has no bearing if users are unaware of it.
- Determine when the education regarding standardized variable definitions occurred. The meaning of certain data points may have changed if this education occurred during your study period.
- Understand that while standardized variable definitions decrease ambiguity, they do not completely eliminate it. Variables with somewhat nebulous definitions, like ASA classification, may still contain inaccuracies.
The Importance of Enhanced Communication With HIM Teams
The importance of open communication between researchers and HIM professionals cannot be overemphasized. While clinicians may have a better idea of both the subject matter under study and the clinical processes involved, HIM professionals typically have a more comprehensive understanding of where to extract the desired data as well as the quality of the data being extracted. Utilizing both perspectives to their fullest extent is crucial to successfully using EMR data for research.
Ideally, each department within a facility would employ clinical staff dually trained in clinical informatics. These individuals would have an understanding of both the technical and clinical aspects of the EMR. Unfortunately, dually trained individuals are not yet readily available, and for smaller departments, they may not be the best use of resources. It can, however, be very advantageous to develop HIM professionals and researchers who understand the other side of the data enough to productively collaborate. Clinicians with a basic understanding of EMR systems and HIM professionals with a basic understanding of the clinical topics being researched would significantly reduce the possibility for EMR data being erroneously used because of misunderstandings. Further, departmental research committees, if established, should be structured to include a clinician with some level of expertise with the facility’s EMR; these committees can serve not only to help investigators optimize their study design protocols but also provide consultation on the data collection process to ensure the data-gathering methods are also optimized.
The initial airway device data in our case study illustrate how a lack of open communication, combined with a lack of basic HIM knowledge on the part of a research team, can negatively impact data collection. Our clinical research team interpreted the cases listing “none” for airway device to mean that an airway device was not placed. After extensive chart review, however, it was discovered that the HIM team had used the term “none” to indicate a blank field or null query result when extracting the airway device data. The confusion caused by this misunderstanding delayed data analysis and could easily have been avoided if the research and HIM teams had communicated in more detail about potential output options.
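The ambiguity described above can be avoided by agreeing, before extraction, on an explicit label for blank or null values that cannot be mistaken for a clinical finding. The Python sketch below shows one way to do this; the label text and function name are hypothetical and are not taken from our HIM team’s actual extraction scripts.

```python
from typing import Optional

# Hypothetical export label for a blank field or null query result.
# It is deliberately not "none", which a reader could take to mean "no device placed".
MISSING_LABEL = "not documented"


def label_airway_device(raw_value: Optional[str]) -> str:
    """Map a raw extracted value to an unambiguous export label.

    Blank or null values become MISSING_LABEL; documented values pass
    through with surrounding whitespace removed.
    """
    if raw_value is None or raw_value.strip() == "":
        return MISSING_LABEL
    return raw_value.strip()


# Made-up raw values for illustration only.
raw_values = [None, "", "LMA", " natural airway "]
print([label_airway_device(v) for v in raw_values])
# ['not documented', 'not documented', 'LMA', 'natural airway']
```

With this convention, a documented “natural airway” remains distinguishable from a field that was never completed.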
Recommendations regarding enhancing communication with your HIM team are as follows:
- Use the information provided in this article, along with additional research, to gain a basic understanding of EMR systems and the data they contain; this will allow for more productive communication with the HIM team and more precise data requests.
- Schedule frequent team meetings when developing strategies for data pulls to ensure that both groups of professionals understand the goals of the project and the processes involved in extracting requested data. Potential sources for error, such as differences in terminology, can be easily ameliorated during these meetings, leading to improved efficiency and accuracy of the data management phase of the study.28
- Understand that HIM professionals are only extracting the data that EMRs contain: the old adage, “Garbage In, Garbage Out” still applies. If the medical record is incomplete, inconsistent, and inaccurate, the data output may also be incomplete, inconsistent, and inaccurate. Further to this point, incomplete medical records can also produce complete data sets that are simply inaccurate, leading to a potential increase in type I and type II errors.
- Manually review a small cohort of records to validate the integrity of any data extraction and safeguard against discrepancies.29
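The validation step recommended above can be sketched as a reproducible random sample of cases for manual review, followed by a comparison of the extracted values against the reviewed charts. This Python example is a minimal illustration; the function names, sample size, and values are hypothetical, and simple agreement is only one of several possible validation measures.

```python
import random
from typing import Dict, List


def draw_validation_sample(case_ids: List[int], k: int, seed: int = 0) -> List[int]:
    """Pick up to k case IDs at random, reproducibly, for manual chart review."""
    rng = random.Random(seed)
    return sorted(rng.sample(case_ids, min(k, len(case_ids))))


def agreement_rate(extracted: Dict[int, str], reviewed: Dict[int, str]) -> float:
    """Fraction of manually reviewed cases whose extracted value matches the chart."""
    if not reviewed:
        raise ValueError("no reviewed cases supplied")
    matches = sum(1 for cid, value in reviewed.items() if extracted.get(cid) == value)
    return matches / len(reviewed)


# Draw 50 of 1654 hypothetical case IDs for review.
sample_ids = draw_validation_sample(list(range(1, 1655)), k=50)
print(len(sample_ids))  # 50

# Made-up values: the extraction disagrees with the chart for case 3.
extracted = {1: "LMA", 2: "ETT", 3: "unknown", 4: "LMA"}
reviewed = {1: "LMA", 2: "ETT", 3: "LMA", 4: "LMA"}
print(agreement_rate(extracted, reviewed))  # 0.75
```

Fixing the seed lets the research and HIM teams regenerate the same review list later, which is useful when discrepancies need to be traced.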
Issues of record incompleteness, inconsistency, and inaccuracy are not unique to our EMR system; they are common challenges for every institution.14,15,30 Unfortunately, information describing the risks of using EMR data, although readily available, is not found in sources typically frequented by anesthesia providers. Improved care coordination, patient safety, and quality of care31 are a few of the many benefits of EMR adoption often touted; however, it is critical that anesthesiologists understand the significance of the data quality issues that arise from inaccurate, incomplete, and inconsistent data entry and how these issues can undermine central EMR functions.13 Additionally, although most organizations, including our own, deploy HIM teams for ongoing EMR configuration and audits, it is important for clinicians and researchers to recognize that present issues with data quality still have the potential to lead to inaccurate conclusions.
The examples cited in this article should motivate anesthesia providers to approach EMRs with an increased awareness regarding accurate and complete documentation. Anesthesia departments should also be consistently monitoring their EMR systems for opportunities to streamline provider documentation while improving data accuracy and completeness. The importance of this cannot be overstated, because no purely technical solution exists that can overcome inaccurate data capture; the success of EMR research relies on the presence of accurate data.32 We encourage researchers using EMR data to review the variables studied; to gain an understanding of how, and in what context, documentation occurs33; to explore what alternatives are used; and to increase collaboration with providers and HIM professionals, if available. These strategies will help ensure that one understands and rightfully acknowledges the limitations associated with EMR exploration and the implications these limitations may have on analysis and conclusions drawn.
It is important to remember that the primary purpose of EMRs is to facilitate patient care, and that the shortcomings of EMR data have implications that extend far beyond data mining for research. As medical decision making becomes increasingly reliant on information technologies,34,35 patient care will suffer if that reliance is on data that are inaccurate or incomplete.13,36–39 To address these problems and promote higher data quality within EMRs, a shift to a more proactive approach that improves the comprehensiveness and accuracy of health data upon entry by providers is necessary.30 Equipped with the information provided in this case study, providers and researchers alike will be better suited to recognize the limitations of the data available and therefore more appropriately utilize this information.
Future analyses should focus on the proper implementation of EMRs that optimizes compliance and data collection. This will not only accomplish the goal of increasing the usability of the data for research purposes, but it will also promote quality of care and the potential for EMR data to support overall population health.
Name: Amanda W. Baier, MPH.
Contribution: This author helped in data management and analysis, manuscript composition and editing, and literature review.
Name: Daniel J. Snyder.
Contribution: This author helped in data management and analysis, manuscript composition and editing, and literature review.
Name: Izabela C. Leahy, MS, RN, BSN.
Contribution: This author helped in study design/idea generation, manuscript composition, and editing.
Name: Lance S. Patak, MD, MBA.
Contribution: This author helped in manuscript composition and editing.
Name: Robert M. Brustowicz, MD.
Contribution: This author helped in study design/idea generation, manuscript composition, and editing.
This manuscript was handled by: Nancy Borkowski, DBA, CPA, FACHE, FHFMA.
1. Sahoo U, Bhatt A. Electronic data capture (EDC)—a new mantra for clinical trials. Qual Assur. 2003;10:117–121.
2. Safran C, Bloomrosen M, Hammond WE. Toward a national framework for the secondary use of health data: an American Medical Informatics Association White Paper. J Am Med Inform Assoc. 2007;14:1–9.
3. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc. 2013;20:144–151.
4. Buntin MB, Burke MF, Hoaglin MC, Blumenthal D. The benefits of health information technology: a review of the recent literature shows predominantly positive results. Health Aff (Millwood). 2011;30:464–471.
5. Jones SS, Rudin RS, Perry T, Shekelle PG. Health information technology: an updated systematic review with a focus on meaningful use. Ann Intern Med. 2014;160:48–54.
7. Silow-Carroll S, Edwards JN, Rodin D. Using electronic health records to improve quality and efficiency: the experiences of leading hospitals. Issue Brief (Commonw Fund). 2012;17:1–40.
8. AHIMA. HIM functions in healthcare quality and patient safety. J AHIMA. 2011;82:42–45.
9. Zeng X, Reynolds R, Sharp M. Redefining the roles of health information management professionals in health information technology. Perspect Health Inf Manag. 2009;6:1f.
10. AHIMA. HIM and health IT: discovering common ground in an electronic healthcare environment. J AHIMA. 2008;79:69.
11. Meeks DW, Smith MW, Taylor L, Sittig DF, Scott JM, Singh H. An analysis of electronic health record-related patient safety concerns. J Am Med Inform Assoc. 2014;21:1053–1059.
12. West SL, Blake C, Liu Z, McKoy JN, Oertel MD, Carey TS. Reflections on the use of electronic health record data for clinical research. Health Informatics J. 2009;15:108–121.
13. Madden JM, Lakoma MD, Rusinak D, Lu CY, Soumerai SB. Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc. 2016;23:1143–1149.
14. Bayley KB, Belnap T, Savitz L, Masica AL, Shah N, Fleming NS. Challenges in using electronic health record data for CER: experience of 4 learning organizations and solutions applied. Med Care. 2013;51:S80–S86.
15. Hersh WR, Weiner MG, Embi PJ. Caveats for the use of operational electronic health record data in comparative effectiveness research. Med Care. 2013;51:S30–S37.
16. Padilla MA, Algina J. Type I error rates for a one factor within-subjects design with missing values. J Mod Appl Stat Methods. 2004;3:406–416.
17. Rostami R, Nahm M, Pieper CF. What can we learn from a decade of database audits? The Duke Clinical Research Institute experience, 1997–2006. Clin Trials. 2009;6:141–150.
18. Hogan WR, Wagner MM. Accuracy of data in computer-based patient records. J Am Med Inform Assoc. 1997;4:342–355.
19. Hemilä H, Al-Biltagi M, Baset AA. Retraction: vitamin C and asthma in children: modification of the effect by age, exposure to dampness and the severity of asthma. Clin Transl Allergy. 2012;2:6.
20. Byar DP. Problems with using observational databases to compare treatments. Stat Med. 1991;10:663–666.
21. McDonald CJ, Hui SL. The analysis of humongous databases: problems and promises. Stat Med. 1991;10:511–518.
22. Terry AL, Chevendra V, Thind A, Stewart M, Marshall JN, Cejic S. Using your electronic medical record for research: a primer for avoiding pitfalls. Fam Pract. 2010;27:121–126.
24. Institute of Medicine Committee on Patient Safety and Health Information Technology. Health IT and Patient Safety: Building Safer Systems for Better Care. Washington, DC: National Academies Press; 2011.
25. Wells BJ, Chagin KM, Nowacki AS, Kattan MW. Strategies for handling missing data in electronic health record derived data. EGEMS (Washington, DC). 2013;1:1035.
26. de Lusignan S, van Weel C. The use of routinely collected computer data for research in primary care: opportunities and challenges. Fam Pract. 2006;23:253–263.
27. Opmeer BC. Electronic health records as sources of research data. JAMA. 2016;315:201–202.
28. Bellazzi R, Diomidous M, Sarkar IN, Takabayashi K, Ziegler A, McCray AT. Data analysis and data mining: current issues in biomedical informatics. Methods Inf Med. 2011;50:536–544.
29. Hripcsak G, Knirsch C, Zhou L, Wilcox A, Melton G. Bias associated with mining electronic health records. J Biomed Discov Collab. 2011;6:48–52.
30. Botsis T, Hartvigsen G, Chen F, Weng C. Secondary use of EHR: data quality issues and informatics opportunities. Summit Transl Bioinform. 2010;2010:1–5.
31. Blumenthal D. Stimulating the adoption of health information technology. N Engl J Med. 2009;360:1477–1479.
32. Weiner MG, Embi PJ. Toward reuse of clinical data for research and quality improvement: the end of the beginning? Ann Intern Med. 2009;151:359–360.
33. Dean BB, Lam J, Natoli JL, Butler Q, Aguilar D, Nordyke RJ. Review: use of electronic medical records for health outcomes research: a literature review. Med Care Res Rev. 2009;66:611–638.
34. Koppel R, Metlay JP, Cohen A. Role of computerized physician order entry systems in facilitating medication errors. JAMA. 2005;293:1197–1203.
35. Anand V, Carroll AE, Downs SM. Automated primary care screening in pediatric waiting rooms. Pediatrics. 2012;129:e1275–e1281.
36. Kohn LT, Corrigan JM, Donaldson MS. To Err Is Human: Building a Safer Health System. A Report of the Committee on Quality of Health Care in America (Institute of Medicine Report). Washington, DC: National Academies Press; 2000.
37. Ridgely MS, Greenberg MD. Too many alerts, too much liability sorting through the malpractice implications of drug-drug interaction clinical decision support. St Louis University J Health Law Policy. 2012;5:257–296.
38. Leviss J, Charney P. HIT or Miss: Lessons Learned From Health Information Technology Implementations. 2nd ed. Chicago, IL: American Health Information Management Association; 2013.
39. Hasan S, Padman R. Analyzing the effect of data quality on the accuracy of clinical decision support systems: a computer simulation approach. AMIA Annu Symp Proc. 2006:324–328.