Purpose: To discuss common issues associated with large databases and present possible solutions for improving the quality and usefulness of large database research.
Background: The volume of electronic healthcare-related data is growing exponentially. Some of these data are stored in registries and administrative databases. These data repositories are increasingly common and can serve as sources for nurse-driven research and quality improvement activities. Although large databases hold a wealth of useful information, they have limitations that may bias results. These limitations include missing data and cases, questionable data accuracy and validity, and the statistical effects of large samples.
Description: Researchers using large databases to address quality, safety, clinical, or systems issues have a variety of techniques available for dealing with data issues. Proper data cleaning activities such as screening, visualization, and outlier/inlier identification are essential for addressing inaccurate values within large data sets. Common methods for addressing missing data include case analyses and various imputation techniques. Statistical measures such as risk reduction and effect size are also useful when working with large sample sizes, where even trivial differences can reach statistical significance.
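The techniques named above can be illustrated with a minimal Python sketch. It is not the authors' method: the function names are hypothetical, the 1.5 x IQR rule is one common screening convention among several, "case analyses" is assumed here to mean complete case analysis, and single mean imputation stands in for the broader family of imputation techniques. Cohen's d and absolute risk reduction are shown as examples of measures that describe the size of a difference rather than its statistical significance.

```python
import math
import statistics


def iqr_outliers(values, k=1.5):
    """Screen for outliers: return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def complete_cases(values):
    """Complete case analysis: keep only records with an observed value."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Single mean imputation: replace each missing value with the observed mean."""
    observed_mean = statistics.fmean(complete_cases(values))
    return [observed_mean if v is None else v for v in values]


def cohens_d(group_a, group_b):
    """Effect size: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / math.sqrt(pooled_var)


def absolute_risk_reduction(events_control, n_control, events_treated, n_treated):
    """Difference between the control and treated event rates."""
    return events_control / n_control - events_treated / n_treated


if __name__ == "__main__":
    # Made-up lengths of stay in days; None marks a missing value.
    stays = [3, 4, None, 5, 4, 41, 3, None, 4]
    observed = complete_cases(stays)
    print("Flagged for review:", iqr_outliers(observed))
    print("After mean imputation:", mean_impute(stays))
```

A flagged value is a candidate for review against the source record, not an automatic deletion, and the choice between dropping incomplete cases and imputing them depends on why the data are missing.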
Conclusion/Implications: Registries and administrative databases provide healthcare researchers with increasing opportunities to address a wide variety of important practice and patient care questions. Healthcare researchers are encouraged to explore large data sets as they look for ways to improve patient safety and quality care, develop evidence-based practice guidelines, and fulfill regulatory and accreditation requirements.