Kim, Katherine K. MPH, MBA*; McGraw, Deven JD, MPH, LLM†; Mamo, Laura PhD‡; Ohno-Machado, Lucila MD, PhD§
*Health Equity Institute, San Francisco State University, San Francisco, and Betty Irene Moore School of Nursing, University of California Davis, Sacramento, CA
†Center for Democracy & Technology, Health Privacy Project, Washington, DC
‡Department of Health Education, Health Equity Institute, San Francisco State University, San Francisco
§Division of Biomedical Informatics, University of California San Diego, La Jolla, CA
Supplemental Digital Content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Website, www.lww-medicalcare.com.
Supported by Agency for Healthcare Research and Quality (AHRQ) Grant R01 HS19913-01, and the AcademyHealth EDM Forum.
The authors declare no conflict of interest.
Reprints: Katherine K. Kim, MPH, MBA, San Francisco State University, Health Equity Institute, 1600 Holloway Avenue, HSS 359, San Francisco, CA 94132. E-mail: firstname.lastname@example.org.
Comparative effectiveness research (CER) has emerged as a major national priority in the drive to improve individual and population health. In 2009, Congress devoted $1.1 billion to CER in the American Recovery and Reinvestment Act (ARRA).1 ARRA further dedicated $27 billion to encourage the adoption of electronic medical records, thereby increasing potential data sources for CER.2 The Patient-Centered Outcomes Research Institute, established by Congress in the Patient Protection and Affordable Care Act of 2010, was created to identify research priorities and conduct research that compares the effectiveness of medical treatments.3
However, a recent IOM roundtable report suggested that the slow pace and limited quality of clinical research, including CER, have limited the effective translation of evidence into clinical practice.4 Clinical researchers face obstacles to conducting CER, including samples at a single site that are too small to answer basic CER questions and an inability to access existing data that are fragmented and siloed across multiple sites.5 By using existing sources, CER data can be accrued more quickly and efficiently from several institutions, potentially having a more timely impact on care.
Distributed research networks (DRNs) are emerging as a promising model for defragmenting and delivering data while managing privacy risks in research.6 A DRN is a computer network in which “data stewards” maintain data in their own environments while allowing access through controlled network functions rather than directly integrating computer systems or exporting datasets.7 Questions have been raised about whether it is possible to conduct CER by accessing data at multiple sites and what rules or procedures are needed to govern the who, what, when, where, and how of that access to best assure responsible conduct of research. To address these issues, a framework for DRN governance, a “system of administration and supervision through which research is managed, participants and staff are protected, and accountability is assured,” is a necessary foundation.8
Several well-known international and US privacy frameworks have been applied to the governance of research networks and databases. One set of principles for shared data governance comes from the Organization for Economic Co-operation and Development (OECD) Guidelines on Human Biobanks and Genetic Research Databases. The OECD recommends the following principles: be transparent and accountable; articulate a governance structure and management responsibilities and make them public; ensure that the rights and well-being of participants prevail over research interests; and maintain oversight that complies with legal and ethical principles.9 However, the OECD principles do not offer specific guidance for DRNs or other kinds of electronic data sharing. The OECD governance principles are consistent with another widely known framework, the Fair Information Practice Principles (FIPPs), on which US policymakers and industry often rely to develop policies and best practices for stewardship of sensitive personal information, including: stating the purpose for collecting information; limiting the collection and use of the information to the minimum necessary; being open and transparent about the information collected about individuals; adopting reasonable security protections; and creating a system of accountability for abiding by laws and policies governing data use and disclosure.10 A framework promulgated by the Office of the National Coordinator for Health Information Technology (ONC), the Nationwide Privacy and Security Framework For Electronic Exchange of Individually Identifiable Health Information, is based on FIPPs.11 The ONC framework focuses on health information exchange (HIE) primarily for health care and does not comprehensively address the secondary use of data for networked research.
Appropriate policy frameworks and governance principles are needed to ensure the legal, ethical, and socially responsible conduct of research using DRNs.12 The aim of this article is to describe the initial development of a flexible, ethical policy framework to govern the Scalable National Network for Effectiveness Research (SCANNER) project, whose goal is to develop and demonstrate a scalable, flexible technical infrastructure for DRNs that enables near real-time CER. The SCANNER policy framework may serve as an example of how a multistate DRN can operate within privacy and security laws and best-practice principles. The ability to leverage such a DRN is important to clinician and public health participants who rely on CER to drive effective health improvement and to researchers who seek to conduct efficient and timely translational research.
The policy development project within SCANNER utilized the ONC’s FIPPs-based framework to organize applicable state and federal laws and regulations related to privacy, confidentiality, and security. The resulting privacy and security matrix was applied to 3 CER use cases. An expert panel provided input into policy analysis and development. Finally, the legal and broader governance principles identified in the framework and through expert panel discussion were used to inform and establish requirements for development of SCANNER technology to manage and monitor the network.
Privacy and Security Matrix
A matrix was created to identify requirements for privacy and security in each state where SCANNER participating organizations are located, as well as in federal law. The matrix was organized according to the articulation of the FIPPs set forth in the ONC Nationwide Privacy and Security Framework For Electronic Exchange of Individually Identifiable Health Information.11 Although the ONC principles apply to identified data for purposes of care, we extended these principles and applied them to research data. Legal and regulatory references were found by searching Westlaw statute databases, federal Web sites for NIST and HHS, state Web sites, and the California Office of Health Information Integrity’s Health Information Library. Security requirements were based on previous work by the California Privacy and Security Advisory Board of the California Department of Health and Human Services.13 References were verified by a second researcher before inclusion in the final privacy and security matrix.
A volunteer expert panel of 7 members was convened in 5 meetings in person and through teleconference over 7 months to provide input to the policy framework and review findings. The members were selected based on their expertise in privacy or security, HIE, or implementation of data sharing in clinical, research, and public health settings. They represented consumer advocacy and research organizations, public health and state government agencies, and provider organizations.
CER Use Cases
Three use cases were selected to illustrate the range of potential policy issues that might arise if the CER studies were implemented on the SCANNER network. The use cases were selected because they addressed important clinical conditions (cardiovascular disease, diabetes/hypertension), represented a range of possible data sharing scenarios [summary data, limited dataset (LDS), and identified data], and involved research and clinical sites in multiple states (California, Illinois, Massachusetts, and the federal Veterans Administration).
- Medication Surveillance (MS) is an observational CER project to monitor the effectiveness of antiplatelet and anticoagulant medications and conduct medication safety surveillance. Data are collected from electronic health records (EHRs) with an institutional review board (IRB)-approved waiver of consent. Data are analyzed locally, with only aggregate results and statistical summaries of outcomes accessible through SCANNER. The sites include Brigham and Women’s Hospital in Massachusetts, University of California San Diego (UCSD), and the federal Tennessee Valley Health System-Veterans Administration.
- Medication Therapy Management (MTM) is a clinical trial comparing diabetic and hypertensive patients treated with standard of care versus those whose medications are comanaged by a physician and a pharmacist. Patients provide consent, and an LDS as defined in the Health Insurance Portability and Accountability Act of 1996 (HIPAA), including dates and ages, is accessible through SCANNER. The initial sites are UCSD clinics, but there are plans to expand to external sites.
- Behavioral Economics to Improve Treatment of Acute Respiratory Infections (BEARI), led by the University of Southern California and RAND, tests the effects of behavioral economics principles on physician behavior in the use of antibiotics for acute respiratory tract infections. This separately funded study is used here for illustrative purposes. An LDS on patients and fully identified data on research subjects (physicians) are shared and analyzed. Nonpatient research data are not subject to HIPAA, and the confidentiality of physician information is typically not covered by state medical privacy laws. Sites include Partners HealthCare affiliates in Massachusetts, Northwestern University Hospital in Illinois, and California clinics including AltaMed Community Clinics (42 sites), QueensCare Family Clinics (27 sites), and the Children’s Clinic of Long Beach Memorial Hospital (6 sites).
The 3 use cases were analyzed with the privacy and security matrix to: (i) determine the specific legal obligations that apply to each institution with respect to data access and sharing; (ii) identify any potential conflicts in law; (iii) determine whether each entity—or the network as a whole—needs to adopt additional policies to ensure compliance with law and FIPPs and if so, (iv) develop a governance mechanism for assuring accountability. The analysis was conducted iteratively and new issues were researched and added to the matrix as they were identified.
Privacy and Security Matrix
Access to identifiable health information for research using either clinical or claims information is subject to regulation at both the federal and state levels. Federal regulations established to implement HIPAA set the conditions under which providers and plans can access, use, and disclose identifiable information for research purposes. In addition, the Common Rule regulates research with identifiable participant data conducted at facilities receiving federal research dollars. Both HIPAA and the Common Rule generally require patient authorization for the use of identifiable data for research purposes, although this requirement can be waived in certain circumstances. Research using data that are not readily identifiable, however, is subject to fewer (if any) legal constraints. Consequently, researchers often seek to access data in deidentified or less identifiable form.
Each institution is responsible for ensuring compliance with federal law and the laws of its state for internal access to information for the research set forth in each use case. A particular IRB is responsible for overseeing and managing the research uses within its institution or affiliated institutions through a delegated IRB agreement. These rules apply both to a data holder accessing information in its own records (such as a provider accessing information in its own medical records system) and to the disclosure of information to other parties. A more in-depth explanation of these rules is provided online (Legal Analysis, Supplemental Digital Content 1, http://links.lww.com/MLR/A504).
Although many of the state differences were relevant to individual institutional members of SCANNER, few were directed at networks. Table 1 maps the ONC framework to applicable federal guidance and selected state-level differences. The detailed, annotated privacy (Matrix, Supplemental Digital Content 2, http://links.lww.com/MLR/A505) and security matrices (Matrix, Supplemental Digital Content 3, http://links.lww.com/MLR/A506) are available online. State rules regarding HIE were relevant to sharing of identified data; however, many of these rules were preliminary and had not been codified in law or regulation. For example, a key governance policy concern is the handling of patient consent for data sharing. California is currently testing the use of opt-in and opt-out authorization in HIE demonstration projects to inform state policy development. Massachusetts requires opt-in for HIE, whereas Illinois requires opt-out except for certain sensitive health information such as mental health and AIDS/HIV.
Another important policy issue is authentication of user access to research data. The National Institute of Standards and Technology (NIST) develops standards and guidelines for federal computer systems and issues them as Federal Information Processing Standards for government-wide use; these are often referenced in state guidelines as well. Illinois and Massachusetts require at least single-factor authentication (NIST level 2), which is often implemented as a unique user name and password. California, in contrast, recommends 2-factor authentication (NIST level 3) for state-funded HIE, which requires an additional verification, such as a 1-time code token, in addition to user name and password for access to protected health information.
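For illustration, the consequence of these differing state requirements for a multistate network can be sketched as follows: a network spanning several states would typically enforce the strictest applicable minimum. This is a minimal sketch, not SCANNER implementation code; the function and variable names are hypothetical, and the state levels are those described in the text.

```python
# Hypothetical sketch: a multistate DRN resolving the minimum NIST
# authentication level it must enforce by taking the strictest state
# requirement among participating sites. Names are illustrative only.
STATE_MIN_NIST_LEVEL = {
    "IL": 2,  # Illinois: at least single-factor (NIST level 2)
    "MA": 2,  # Massachusetts: at least single-factor (NIST level 2)
    "CA": 3,  # California: recommends 2-factor for state-funded HIE (level 3)
}

def network_auth_level(states):
    """Return the strictest minimum NIST level among participating states."""
    return max(STATE_MIN_NIST_LEVEL[s] for s in states)

# A network limited to Illinois and Massachusetts sites could require
# single-factor authentication; adding California sites raises the bar.
print(network_auth_level(["IL", "MA"]))        # level 2
print(network_auth_level(["IL", "MA", "CA"]))  # level 3
```

The design choice of taking the maximum reflects the need for a single network-wide policy that satisfies every participating state simultaneously.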
The research purpose and use case served to focus the analysis of which laws apply and the interpretation of the applicable law. Using the framework in Table 1, the SCANNER policy requirements were created to guide implementation of the technology to manage the trust relationships among institutions and users as well as monitor and enforce data sharing rules. Table 2 includes selected requirements. In SCANNER’s use cases, fully identifiable patient data remain under the control of the data suppliers. Only deidentified data or LDS are shared with other SCANNER participants. These use cases are consistent with other examples of DRNs.14
In the MS and MTM use cases, the results of research conducted at each institution are shared either in deidentified or LDS form. In BEARI, the results include identified research participants (ie, clinicians) but not patients. Such sharing can then take place with fewer federal regulatory hurdles. In addition, state laws governing health data disclosure are not relevant as they typically govern only identifiable patient data. If the data are deidentified, they can be shared without restrictions. Disclosure of an LDS does require a HIPAA-compliant data use agreement (DUA) and attempts to reidentify LDS or identified data would be prohibited in the DUA.
The requirements were aligned with SCANNER’s goal of creating and testing a flexible, scalable infrastructure for CER by designating minimum and optional specifications. The MS use case presented the fewest requirements. Basic requirements for participation in the network such as a signed network agreement and IRB approval, including approvals of data collected under a waiver of consent, apply to all use cases. The MTM use case yielded more requirements including the need for a DUA and screening for HIPAA identifiers to assure that identifiers include only dates and ages. The BEARI use case led to further requirements: audit trails for access to identified data and 2-factor authentication triggered by the sharing of identified subject data across institutions.
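The MTM screening requirement above, checking that shared records qualify as an LDS containing no HIPAA identifiers beyond dates and ages, can be sketched as a simple gatekeeping step. This is an illustrative assumption about how such a check might look, not the SCANNER implementation; the field names and the (partial) identifier list are hypothetical.

```python
# Hypothetical sketch of the MTM screening step: reject any record bound
# for the limited dataset that carries a direct HIPAA identifier, and pass
# through only the permitted fields. Field names are illustrative.

# A subset of the 18 HIPAA identifiers that must be absent from an LDS
# (dates and ages, by contrast, are permitted in a limited dataset).
DISALLOWED_FIELDS = {
    "name", "street_address", "phone", "email", "ssn",
    "medical_record_number", "health_plan_id", "photo",
}
ALLOWED_FIELDS = {"service_date", "age", "diagnosis", "medication"}

def screen_record(record: dict) -> dict:
    """Raise on disallowed identifiers; otherwise keep only LDS fields."""
    present = DISALLOWED_FIELDS & record.keys()
    if present:
        raise ValueError(f"record contains HIPAA identifiers: {sorted(present)}")
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

# A record with only dates, ages, and clinical content passes screening.
clean = screen_record({"service_date": "2013-01-15", "age": 62,
                       "medication": "lisinopril"})
```

In practice such screening would need to cover all 18 identifier categories and free-text fields, which is why the text emphasizes automated screening as a network requirement rather than a site-by-site manual check.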
The complexity of the federal laws regulating research, the possibility of different interpretations of the law by different IRBs (in particular, whether the circumstances for a waiver of patient authorization are present), and the existence of different state laws and health information organization (HIO) policies contribute to the perception that conducting CER across multiple states is a challenge. Building an environment of trust among institutions will likely require the adoption of collective policies and best practices that fill perceived gaps in the law. Because there are few laws or guidelines specific to the governance of research networks, the SCANNER expert panel recommended reinforcing the FIPPs as the policy framework, with a particular focus on assuring privacy and confidentiality for research participants’ data and trust among member institutions.
The SCANNER network policy is manifested in agreements or contracts and/or managed through operational or technical functionality. Because not all potential use cases can be anticipated at the start of the network, this process of developing DRN policy applicable to use cases will be iterative. Three modes of data sharing in CER are represented by the use cases: sharing of aggregate or statistical summaries only, sharing of an LDS, and sharing of subject-identified data. In this paper, we do not address the validity of CER methods using these different types of data. Rather, we focus on the policy needs related to the type of data sharing. Other types of data sharing beyond the 3 use cases might trigger different legal obligations on the part of the data holders and recipients. For example, a network with a single repository that serves as a research data source for multiple institutions from multiple states and allows redisclosure for subsequent studies may need to be governed by policies that satisfy the legal requirements of all participating states simultaneously. It is critical to perform the policy analysis on a use case basis rather than attempting to create a one-size-fits-all policy.
The analysis identified several governance issues that are particularly relevant to CER DRNs and in line with issues identified in the literature. For example, clinicians and researchers must protect patient anonymity. A key policy issue affecting access to patient health information used as CER data is whether the data are identified, or reidentifiable even when certain fields are removed in compliance with HIPAA requirements for deidentification.15,16 Guarantees of privacy are not possible, but DRNs may be considered business associates under HIPAA and have a responsibility to protect data from breaches. One strategy DRNs might use is deidentification of data. The HIPAA safe harbor method of deidentification involves removal of 18 identifiers, including biometric and genetic data. The statistical method of deidentification allows identified data to be retained if a statistician certifies that the risk of reidentification is very small. There is a growing body of literature on methods of deidentification that should be considered in determining an appropriate strategy that maintains public trust.15,17,18
A second issue is whether and how to obtain patient consent for data sharing.19,20 DRNs by design generally expect that consent was addressed during data collection. If additional research is conducted on existing data, unresolved issues of ethical conduct in human subjects research arise: can there be true informed consent to participate in research that is not yet determined? How can patients understand the uses of their personal information and the associated risks and benefits in this technical environment? There is an emerging literature on the nature of informed consent when identifiable data are to be shared and when the future uses and/or risks and benefits are not clear.21–23 Although consent is obtained by the data provider, networks need to consider possible liability if the provider’s consent did not explicitly cover network access. There is also conflicting evidence regarding whether the requirement to obtain consent in research leads to biased samples.24 Some studies found that consent did not lead to biased samples in adults,25 whereas others found systematic differences in consent rates among nonwhite and female respondents in a DNA study26 and bias against children with higher levels of tooth decay.27 A DRN that relies on deidentified data, which does not require informed consent, may support a more inclusive sample and a reduced risk of bias.
Third, researchers need to support data use by multiple users.6 Verifying the identity of users and their authorization for access is paramount and can be accomplished with several strategies depending on the type of access needed. In SCANNER, this will be accomplished with single-factor authentication for within-institution sharing and 2-factor authentication for between-institution sharing. Additional network features, such as audit trails and robust identity and access management, will help to preserve public trust.
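The audit-trail requirement mentioned above can be illustrated with a minimal sketch: each access to identified data is recorded with who, where, what, and when, so that use can later be reviewed. The schema and function names here are hypothetical assumptions for illustration, not the SCANNER implementation.

```python
# Hypothetical sketch of an audit trail for access to identified data,
# as required for use cases such as BEARI. Schema is illustrative only.
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []

def record_access(user: str, institution: str, dataset: str, action: str) -> dict:
    """Append one audit entry per access: who, from where, what, and when."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "institution": institution,
        "dataset": dataset,
        "action": action,
    }
    AUDIT_LOG.append(entry)
    return entry

def accesses_by(user: str) -> list[dict]:
    """Support later review: return all entries for one user."""
    return [e for e in AUDIT_LOG if e["user"] == user]

record_access("jdoe", "UCSD", "beari_clinicians", "read")
```

An append-only log of this kind, queried during periodic review, is one common way a network can demonstrate the accountability that the FIPPs call for.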
Although challenges exist in navigating the storage of and access to patient health information at institutions in different states and under different regulatory policies, managing the sharing of research data in an ethical and trustworthy manner is a core responsibility of a DRN. To create a cohesive policy framework for CER DRNs, researchers must review a variety of sources such as federal and multistate laws, regulations, and guidance documents that may be difficult to find, analyze, and reconcile. Institution-specific requirements add another layer of complexity. The need for flexibility in the development and implementation of policies must be balanced with responsibilities of data stewardship. The SCANNER framework offers 1 way to organize and understand the web of privacy and security requirements and the key policy issues that arise.
There is work in progress to develop frameworks for addressing governance issues at different levels such as efforts supported by the Patient Centered Outcomes Research Institute28 and the Agency for Healthcare Research and Quality’s Electronic Data Management Forum.29 There is still a gap in our understanding of patient perceptions about DRNs. In-depth understanding of these attitudes will enhance considerations of the development of patient-centered governance approaches. As DRNs expand and involve multistate collaborators, comparison of those states’ policies should also be undertaken and added to the matrix offered in this paper, providing a resource for other DRNs.
The SCANNER policy framework guides the development of network software that ensures that privacy and confidentiality are preserved in DRNs. This type of software system is complex and has little room for failure. During testing of SCANNER technology, evaluation with prospective techniques, such as probabilistic risk assessment30 or failure modes and effects analysis (http://www.fmeainfocentre.com), will be important and will be incorporated in future work.
4. Olsen L; IOM Roundtable on Evidence-Based Medicine. The Learning Healthcare System: Workshop Summary. Washington, DC: The National Academies Press; 2007.
5. Congressional Budget Office. Research on the Comparative Effectiveness of Medical Treatments: Issues and Options for an Expanded Federal Role. Washington, DC: Congressional Budget Office; 2007.
6. Navathe AS, Clancy C, Glied S. Advancing research data infrastructure for patient-centered outcomes research. JAMA. 2011;306:1254–1255.
7. Brown J, Holmes J, Maro J, et al. Design Specifications for Network Prototype and Cooperative to Conduct Population-Based Studies and Safety Surveillance. Effective Healthcare Research Report 13. Rockville, MD: Agency for Healthcare Research and Quality; 2009.
8. Shaw S, Boynton PM, Greenhalgh T. Research governance: where did it come from, what does it mean? J R Soc Med. 2005;98:496–502.
9. Organization for Economic Co-operation and Development. OECD Guidelines on Human Biobanks and Genetic Research Databases. Paris, France: Organization for Economic Co-operation and Development; 2009.
10. McGraw D, Dempsey JX, Harris L, et al. Privacy as an enabler, not an impediment: building trust into health information exchange. Health Aff. 2009;28:416–427.
11. Office of the National Coordinator for Health Information Technology. Nationwide Privacy and Security Framework For Electronic Exchange of Individually Identifiable Health Information. Washington, DC: Office of the National Coordinator for Health Information Technology; 2008.
12. Diamond CC. Data and information hub requirements. In: Olsen L, Grossman C, McGinnis JM, eds. Learning What Works: Infrastructure Required for Comparative Effectiveness Research: Workshop Summary. Washington, DC: Institute of Medicine, The National Academies Press; 2011:163–172.
13. California Privacy and Security Advisory Board Security Committee. Security Policies Analysis. 2011.
14. Brown JS, Holmes JH, Shah K, et al. Distributed health data networks: a practical and preferred approach to multi-institutional evaluations of comparative effectiveness, safety, and quality of care. Med Care. 2010;48:S45–S51.
15. McGraw D. Building public trust in uses of Health Insurance Portability and Accountability Act de-identified data. J Am Med Inform Assoc. 2013;20:29–34.
16. Sweeney L. k-Anonymity: a model for protecting privacy. Int J Uncertainty Fuzziness Knowl Based Syst. 2002;10:557–570.
17. El Emam K, Arbuckle L, Koru G, et al. De-identification methods for open health data: the case of the Heritage Health Prize claims dataset. J Med Internet Res. 2012;14:e33.
18. Malin B, Benitez K, Masys D. Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule. J Am Med Inform Assoc. 2011;18:3–10.
19. Hawkins A, O’Doherty K. “Who owns your poop?”: insights regarding the intersection of human microbiome research and the ELSI aspects of biobanking and related studies. BMC Med Genomics. 2011;4:72–80.
20. Malin BA, El Emam K, O’Keefe CM. Biomedical data privacy: problems, perspectives, and recent advances. J Am Med Inform Assoc. 2013;20:2–6.
21. McGuire A, Hamilton J, Lunstroth R, et al. DNA data sharing: research participants’ perspectives. Genet Med. 2008;10:46–53.
22. Lemke A, Wolf W, Hebert-Beirne J, et al. Public and biobank participant attitudes toward genetic research participation and data sharing. Public Health Genomics. 2010;13:368–377.
23. Fullerton S, Anderson N, Guzauskas G, et al. Meeting the governance challenges of next-generation biorepository research. Sci Transl Med. 2010;2:15cm3.
24. Kho ME, Duffett M, Willison DJ, et al. Written informed consent and selection bias in observational studies using medical records: systematic review. BMJ. 2009;338:b866.
25. Carter KN, Imlach-Gunasekara F, McKenzie SK, et al. Differential loss of participants does not necessarily cause selection bias. Aust N Z J Public Health. 2012;36:218–222.
26. Meisel SF, Shankar A, Kivimaki M, et al. Consent to DNA collection in epidemiological studies: findings from the Whitehall II cohort and the English Longitudinal Study of Ageing. Genet Med. 2012;14:201–206.
27. Davies G, Jones C, Monaghan N, et al. The caries experience of 5-year-old children in Scotland, Wales and England in 2007-2008 and the impact of consent arrangements: reports of co-ordinated surveys using BASCD criteria. Community Dent Health. 2011;28:5–11.
28. Selby J, Beal A, Frank L. The Patient-Centered Outcomes Research Institute (PCORI) national priorities for research and initial research agenda. JAMA. 2012;307:1583–1584.
29. Hamilton Lopez M, Holve E, Rein A, et al. Involving Patients and Consumers in Research: New Opportunities for Meaningful Engagement in Research and Quality Improvement. Washington, DC: AcademyHealth, EDM Forum; 2012.
30. Marx DA, Slonim AD. Assessing patient safety risk before the injury occurs: an introduction to sociotechnical probabilistic risk modelling in health care. Qual Saf Health Care. 2003;12(suppl 2):ii33–ii38.